TO THE INSTRUCTOR 


This program introduces the student to the elements of statistical reasoning and the 
manner in which this form of reasoning enters into the process of behavioral research. 
The material is presented in a carefully pre-tested sequence. Fundamental topics 
such as variables, values, and distributions are broken down into a series of small, 
sequentially organized steps. The student is led into discussion of such relatively 
complex concepts as decision rules and the probability of Type One and Type Two errors 
only after the prerequisites for his comprehension have been developed in detail. 


After data have been treated as a collection of observed values of a variable, 
distributions of values are considered, along with the manner in which descriptive 
statistics characterize various features of these distributions. Formulas for the 
mean and variance are developed in detail, and in a way that clearly indicates the 
exact feature of a distribution represented by each expression. 


The concept of a sampling distribution is the central theme of the sections on 
statistical inference. Thus, the effects of sampling procedure and sample size in 
determining distributions are discussed in detail. In addition, the role of probability 
theory in the calculation of theoretical sampling distributions is considered as a basis 
for using random sampling procedures. 


In programmed learning, each student proceeds at his own rate. In addition, the 
material is constructed so that the student actively participates in the learning process, 
supplying answers which require his understanding of that item. The answers constitute 
immediate feedback and reinforce the student's correct responses. 


Periodic tests covering the material are provided to allow the student to evaluate 
his understanding as he progresses. Two forms of these exams are included in the 
instructor's manual. 


TO THE STUDENT 


Your programmed text is quite different from an ordinary book. A program 
consists of a large number of "frames," or numbered statements, each of which 
tells you something and asks questions about the material you have learned. The 
frames introduce new material a little at a time and review old material as needed 
to make sure you will remember it. 


You do not study a program in the same way you study a book. The program is 
designed to let you work at your own rate of speed. 


Get ready to use your program by covering the answer column at the right with 
the slider provided for you. Next, read the first frame and write the answer either 
in the blank or on a separate piece of paper. Then, move your slider down to uncover 
the answer and see if you are right. Go on in the same way to the following frames, 
checking your answer to each frame before going on to the next. Practice with the 
following frames. 


1. When a blank has nothing under it, you simply fill in 
whatever best fits the blank. For example, the day of the 
week that follows Thursday is ______.                        Friday 
2. When a blank has two answers underneath it, you select 
the one that best fits. For example, a dog ______ (is/is not) 
an animal.                                                   is 

You will find that the left-hand pages of the program are upside-down and backwards. 
Pay no attention to them until you have finished all of the right-hand pages. 


Turn to Frame 1 and begin. 


1. 


2. 


4. 


Section I: Data 


Suppose you were a psychologist studying how accurately 
a person judges distance. You might place a target some 
distance in front of a subject and ask him how far away 
it was. You would be interested in the difference between 
the distance he reported and the true ______ of the target. 


You might decide to move the target after each of the 
subject's judgments, asking him to judge each new 
distance. In your experiment, you would be collecting 
many different judgments of distance by changing or 
varying the ______ between the subject and 
the target. 


We refer to something that does not change during an 
experiment as a constant. Since the distance between 
the subject and the target is varied or changed, it 
______ (would/would not) be a constant in the experiment 
we just described. 


We refer to the opposite of a constant as a variable. 
Something that changes in an experiment is called a 
variable. In the experiment we just described, the 
distance between the subject and the target is varied; 
therefore, we would refer to the varying distance as a 
______ (constant/variable). 


distance 


distance 


would not 


variable 
5. 


6. 


7. 


8. 


9. 


The target distance would be called a variable in 
your experiment because this distance was ______ 
during the experiment. 


As a psychologist, you might wish to study how 
accurately a person judges weights. You might give 
the subject a lead ball and ask him to fill up a bag with 
sand until the bag felt as heavy as the lead ball. When 
he was finished trying to match the weight of the sand 
bag and the lead ball, you could compare the ______ 
of the bag of sand and the weight of the lead ball. 


You could repeat this procedure several times, each 
time changing or varying the size of the lead ball. In 
this experiment, the weight of the lead ball would be 
referred to as a ______ (constant/variable). 


The weight of the lead ball was a variable in this 
experiment since this weight was ______ 
during the experiment. 


Suppose you conducted the experiment differently. 
Suppose you gave the subject the same lead ball each 
time. You would not expect the subject to produce a 
bag of sand weighing exactly the same amount each 
time, even though the ______ of the lead ball 
remained the same throughout the experiment. 


If you weighed each bag of sand carefully, you would 
doubtless find that the weight differed slightly for 
each bag. Therefore, in this experiment, the weight 
of the lead ball was a ______ (constant/variable), whereas 
the weight of the bag of sand the subject produced each 
time was a ______ (constant/variable). 


changed 


weight 


variable 


changed 


weight 


constant 


variable 




10. 


11. 


12. 


13. 


14. 


As a psychologist, you might be interested in how well 
people can remember things. For example, you might 
read a list of six letters of the alphabet to a subject, 
such as m, t, s, p, g, k, and then ask him to repeat this 
list. Suppose the subject said that the letters were m, 
t, s, p, g, d. He would have repeated only the first 
______ (4/5) letters correctly. 


Suppose you then read another list of 6 letters, for 
example p, f, t, m, s, r. If the subject's response was 
p, f, t, g, f, n, he would have repeated only the first 
______ letters correctly. 


You could conduct an experiment in which you gave the 
subject many different lists of six letters, each time 
asking him to repeat as many of the letters as he could. 
While the actual list of letters would be varied in this 
experiment, the number of letters in each list would 
always be six. 


Since the number of letters in each list would always be 
six, the number of letters would be a ______ (constant/variable) 
in this experiment. 


Every time you give the subject a list of letters, you 
could record how many letters he repeated correctly 
before making an error. You would call this his score 
on each list. Therefore, if he repeated four of the six 
letters correctly before making an error, his score for 
that list would be ______. 


If you read the subject the letters m, p, t, x, z, 8, and 
his response was m, p, t, x, z, d, his score on that 
list would be ______. 
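

The scoring rule described above (count the letters repeated correctly before the first error) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `score` and the variable names are assumptions introduced here, and the two lists are the examples given in the frames above.

```python
def score(presented, response):
    """Count the letters repeated correctly before the first error."""
    count = 0
    for shown, said in zip(presented, response):
        if shown != said:
            break          # stop counting at the first error
        count += 1
    return count


# The two example lists from the frames above (hypothetical variable names).
first = score(list("mtspgk"), list("mtspgd"))
second = score(list("pftmsr"), list("pftgfn"))
print(first, second)
```

Because every list holds six letters, the value returned by this sketch can never be less than 0 or greater than 6.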


constant 
20. 


Since every list contains 6 letters, whenever the subject 
repeated the list without making an error, his score 
would be ______. The worst possible score he could 
make would be zero, which would mean he had repeated 
______ (none/all) of the letters correctly. 


You know, therefore, that all of the subject's scores 
will be no greater than ______ and no less than ______. 


Because the subject's score can vary or change from 
list to list, it would be called a ______ in this 
experiment. 


Considering his score as a variable, list below the 
possible scores that the subject could make (beginning 
with the worst score and moving in order to the best). 


We will refer to each of these possible scores from 
0 to 6 as a possible value of this variable. Thus, the 
smallest possible value of this variable is ______, and 
the largest possible value is ______. 


The subject ______ (could/could not) receive a score of 10 
because the list contains only six letters. Therefore, 
10 ______ (is/is not) a possible value of this variable. 


We have used the word ______ to refer to 
things that may change or vary during an experiment. 
We use the word ______, however, to refer to 
things that do not change during an experiment. 


none 


variable 


0, 1, 2, 3, 4, 5, 6 


could not 


is not 


variable 


constant 
One of the variables we considered was distance. A 
value of that variable would be any particular distance. 
Since 10 feet is a particular distance, it would be a 
______ of that variable. 


Another variable we considered was weight. Values of 
that variable could be particular ______, such 
as 10 pounds, 2 ounces, 3 pounds, etc. 


There are many different things which could be 
variables in an experiment. All variables, however, 
have one thing in common. They are things that may 
______ during an experiment. 


Since many things may be variables, it is often useful 
to give each variable a name. For example, "distance" 
is the name we used for one variable we considered, 
whereas "weight" is the n______ of another 
variable we considered. 


Any particular distance is a value of the variable named 
"distance." Any p______ weight is a value of 
the variable named "weight." 


Two feet ______ (would/would not) be a value of the variable 
named weight. 


In the experiment in which the subject tried to repeat a 
list of six letters, we considered a variable named 
"score." The subject's "score" was the number of 
letters he repeated correctly before making an error. 
The different possible ______ of this variable were 
0, 1, 2, 3, 4, 5, 6. 


value 


weights 


change 


name 


particular 


would not 


values 
Therefore, if the subject's score changed from 3 to 5, 
we would say that the ______ (value/name) of the variable 
had changed. 


You have learned that a ______ is something 
that can change during an experiment. Every time a 
variable changes, it changes from one ______ to 
another. 


Psychologists have studied how fast a rat will run 
down a narrow alley to secure food. The picture below 
shows an experimental setup you might have used if you 
were studying running speeds of rats. 


RAT RUNWAY 


value 


variable 


value 
(continued) 


The rat is placed in one end of the runway and food is 
placed in the ______ (same/other) end of the runway. 


When the door is opened in front of the rat, he is free 
to move all the way to the far end of the runway to 
secure the ______. 


You would be interested in the time between the opening 
of the ______ and the time when the animal 
reached the food. 


Let's imagine that you are conducting the following 
experiment. Suppose you took a rat that normally ate 
four ounces of food a day and that had never been in the 
runway before. You begin to feed the rat only in the 
runway. At precisely the same time each day, you 
place the rat at the starting point of the runway and four 
ounces of food at the other end. Then, you open the 
door of the runway. Each day, you would record the 
______ between the opening of the door and 
the rat's reaching the food. 


In this experiment, you would be interested in a 
______ (variable/constant) named running time. 


The amount of food placed in the runway each day 
would be a ______ (variable/constant) in this experiment. 


other 


food 


door 


time 


variable 


constant 
34. Suppose you repeated this experiment for 10 days. 
The record of your results might look like this: 


DAY     RUNNING TIME 
 1        200 sec. 
 2        100 sec. 
 3        150 sec. 
 4         80 sec. 
 5         40 sec. 
 6         41 sec. 
 7         15 sec. 
 8         10 sec. 
 9          4 sec. 
10          3 sec. 


On day 1, the rat took 200 sec. from the time the 
door opened to reach the food. On day 2, he took 
100 sec. On day 3, he took ______ sec.                       150 
35. The rat's running time on day 10 was ______ sec.         3 
36. The rat's running time was ______ (longer/shorter)       shorter 
on day 7 than on day 3. 
36. Running time was ______ (longer/shorter) on day 3        longer 
than on day 2. 
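

The ten-day record in frame 34 can also be written down as a small data structure. The sketch below is only an illustration: the names `running_time`, `time_on_day`, and `compare` are assumptions introduced here, and the numbers are the ones listed in the record above.

```python
# Observed running times in seconds for days 1 through 10 (from frame 34).
running_time = [200, 100, 150, 80, 40, 41, 15, 10, 4, 3]


def time_on_day(day):
    """Return the observed running time for a day numbered from 1."""
    return running_time[day - 1]


def compare(day_a, day_b):
    """Say whether running time on day_a was shorter or longer than on day_b."""
    a, b = time_on_day(day_a), time_on_day(day_b)
    if a < b:
        return "shorter"
    if a > b:
        return "longer"
    return "the same"


print(time_on_day(10), compare(7, 3), compare(3, 2))
```

Each entry in the list is one observed value of the running-time variable.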
37. Instead of referring to "the variable named running 
time," we shall simply say "running-time" variable. 
Thus, the variable we call "running time" will be 
referred to as the "______"                                  running-time 
variable. 
38. Particular running times are ______ of the               values 
"running-time" variable. 
39. 


40. 


41. 


42. 


A large part of any scientist's work consists of 
observing the variables in experiments and making 
records of these observations. 


Psychologists are chiefly interested in behavior, 
either the behavior of human beings or the behavior of 
animals. Therefore, if you were a psychologist, a 
large part of your work would probably consist of 
observing b______ and making records of 
these observations. 


You might be interested in observing how quickly a 
person could solve a mathematical problem, how 
accurately he could estimate the weight of some 
object, or how much time he spent sleeping. Whatever 
behavior you were interested in, you would 
probably make ob______ of this behavior 
and records of your ob______. 


Each of the variables we have discussed has had 
different possible values. For example, if you tossed 
a coin in the air and it fell on one side or the other, 
the two possible results you could observe would be a 
"head" facing up or a "______" facing up. These 
two possible outcomes are the possible values of the 
variable we would call "results of a coin flip," or 
perhaps "falls of a coin." 


What if the coin were tossed three times and you 
observed that it fell with the "head" facing upwards all 
three times? The observed values of the variable "falls 
of a coin" would be "heads," "heads," and "heads." 
Even though you did not actually observe a "tail," the 
two ______ (possible/observed) values of the variable are still 
"heads" or "tails." 


behavior 


observations 


observations 
Suppose you were conducting a telephone survey in 
which you asked each person you called whether or 
not he had been watching television. Assuming each 
person answered the question, the two possible answers 
are "yes" or "no." Thus, the two possible values of 
the "answer" variable are "______" and "______." 


Suppose four people were called in this telephone survey. 
Suppose you observed that the first person said "yes," 
the second person said "no," the third person said "yes," 
and the fourth said "yes." The record of your 
observations would be: 


              Response 
1st person    yes 
2nd person    no 
3rd person    yes 
4th person    yes 


These four answers are the ______ (observed/possible) values 
of the variable under study. 


Imagine that you were learning how to bowl and that you 
decided to keep track of how you improved with each 
lesson. As a test of your skill you could bowl 5 balls 
following each lesson. After each ball you could reset 
any pins you had knocked down so that exactly 10 pins 
were standing each of the five times you rolled a ball. 
The most pins you could knock down with any one ball 
would be ______ pins. The worst you could do with 
any one ball would be to knock down ______ pins. 
46. 


47. 


48. 


If we consider the number of pins you knock down with 
each ball to be a variable, the smallest possible value 
of the variable would be ______ and the largest possible 
______ of the variable would be 10. 


List all the possible values of the "pins-knocked-down" 
variable, starting with the smallest possible value and 
ending with the largest possible value. 


____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ 


Suppose after the first lesson you rolled 5 balls with the 
following results: 


            PINS KNOCKED DOWN 
1st Ball     0 
2nd Ball     0 
3rd Ball     2 
4th Ball    10 
5th Ball     5 


According to this list of results, you knocked none of 
the pins down with the first ball, none down with the 
second ball, ______ down with the 3rd ball, ______ 
down with the 4th, and ______ down with the 5th ball. 


Thus, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 are the 
______ (possible/observed) values of the variable we are 
studying (the "pins-knocked-down" variable), and 0, 0, 
2, 10, and 5 are the ______ (possible/observed) values of that 
variable. 
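

The difference between the values a variable could take and the values actually recorded can be illustrated with a short Python sketch. The names `possible_values`, `observed_values`, and `never_observed` are assumptions introduced here; the numbers are those of the bowling example above.

```python
# Every value the "pins-knocked-down" variable could take with one ball.
possible_values = set(range(0, 11))

# The five values actually recorded after the first lesson.
observed_values = [0, 0, 2, 10, 5]

# Each recorded value must be one of the possible values.
all_observed_are_possible = all(v in possible_values for v in observed_values)

# Values that could have occurred but were never recorded in these five balls.
never_observed = sorted(possible_values - set(observed_values))

print(all_observed_are_possible, never_observed)
```

Note that the same value (0) can be recorded more than once, whereas each value appears only once in the set of values the variable can take.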
value 


2, 10 


possible 
51. 


52. 


53. 


Each observed value of a variable is referred to 
as a single observation. Therefore, in the illustration 
we just considered, the five observed values of the 
"pins-knocked-down" variable would be referred to as 
five ______. 


If you were studying how fast a rat ran down an alley to 
secure food and observed that it took him exactly 
10 seconds, this time would be an observed value of 
the "running-time" variable and ______ (would/would not) be 
referred to as a single observation. 


It is possible the rat could take as long as 20 minutes to 
reach the food. Suppose, however, that the longest 
observation of running time was 3 minutes. Therefore, 
whereas ______ minutes would be a possible value of the 
running-time variable, ______ minutes would be both a 
possible value and an observed value. 


When a particular variable is observed or studied in an 
experiment, records are made of the observed values 
of this variable. These records are called data. For 
example, we just considered an experiment in which a 
rat ran down an alley to get food. The time it took for 
the rat to reach the food each day was observed and 
recorded. Our records of these running times are the 
d______ from that experiment. 


Another experiment we discussed dealt with a subject's 
repeatedly attempting to match the weight of a lead ball 
by filling a bag with sand. Records of the weight of 
each bag of sand the subject produced are the 
______ from that experiment. 
observations 


would 


20, 3 


data 


data 
54. 


55. 


56. 


57. 


58. 




Notice that we referred to the ______ (possible/observed) values 
of the variable as the data. A list of the possible values 
of a variable would not be considered data. 


Suppose you tossed a coin two times and the observed 
value of the variable "falls of the coin" was "heads" on 
both tosses. Then "heads" and "tails" are the two 
______ (observed/possible) values of the variable, whereas 
"heads" and "heads" are the two ______ (possible/observed) 
values. 


In the preceding frame, the list of values we would call 
data was ______ ("heads" and "heads"/"heads" and "tails"). 
"Heads" and "heads" would be considered data since 
they are the ______ (possible/observed) values of the variable. 


Television stations are naturally interested in which 
programs are preferred by television viewers. Imagine 
that you were hired to determine viewing preferences 
in an area in which there were only three television 
channels: Channel 5, Channel 7, and Channel 9. 
Suppose you asked a number of television viewers the 
following question: "If you had to watch only one of 
the three television stations for a week, which one 
would you select?" Acceptable answers to this 
question would be Channel ______, ______, or ______. 


If you considered the viewer's answer as a variable, 
the three possible values of the variable would be 5, 
7, and ______. 
possible 
"heads" and "heads" 
While you probably would ask many people this question 
in a real study, let us suppose you asked only three 
people and the first person selected 7, the second 
person selected 9, and the third person selected 9. The 
values 5, 7, and 9 would be the ______ (possible/observed)   possible 
values of the "answer" variable, whereas the values 7, 
9, and 9 would be the ______ values of the                   observed 
variable. 


Since an observation is any observed value of a variable, 
the three observations are ______, ______, and 9.            7, 9 
Your data would be the values ______, ______, and ______,    7, 9, 9 
since these were the ______ (observed/possible) values       observed 
of the "answer" variable. 


In the television survey, the answer of each of the 
three people was a ______ (constant/variable), but the question 
they were asked was a ______ (constant/variable). 


variable 


constant 


It is often useful to distinguish between continuous and 
discrete variables. If you were to count the number of 
people in a room, it would be possible to have 8 people 
or 9 people, but it would not be possible to have 8½ 
people. It is this characteristic of the variable named 
"number of people in a room" which makes it a discrete 
variable. The variable is discrete since there are no 
values of the variable between 8 and 9, or between 10 
and 11, or between 12 and ______.                            13 
63. 


(Continued) 


On the other hand, the variable we call "length" is an 
example of a continuous variable. No matter which 
particular pair of lengths you considered, it 
______ (would/would not) be possible to imagine a length 
between these two lengths. For example, 8½ inches is between 
8 inches and ______ (7/9) inches. (By "between," we mean 
greater than one length but less than the other.) 


Similarly, 2½ feet is a value of the variable called 
"length," which is b______ the values 2 feet and 
3 feet. In other words, 2½ feet is larger than 2 feet 
but less than 3 feet. 


A continuous variable has an unlimited number of values 
because no matter how close two values are to each 
other, it ______ (is/is not) always possible to imagine another 
value which would lie between them. For example, 
even though 2.10 inches and 2.20 inches are close 
together, ______ (2.15/2.25) inches is between them. On the 
other hand, if the variable you are studying is discrete, 
you can find two values of the variable such that there 
is no value between them. For example, the number 
of pennies you have in your pocket would be a value of 
a ______ (discrete/continuous) variable, since you could only 
have in your pocket one penny, two pennies, three 
pennies, and so on. Of course, you could not have any 
number between these values. 
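

The "value between" test that separates continuous from discrete variables can be sketched in Python. This is an illustration only: the helper names `value_between` and `counts_between` are assumptions introduced here, and exact fractions are used so that the arithmetic on 2.10 and 2.20 is not affected by floating-point rounding.

```python
from fractions import Fraction


def value_between(a, b):
    """Return a value lying strictly between two different lengths."""
    return (a + b) / 2


def counts_between(low, high):
    """List the whole-number counts lying strictly between two counts."""
    return list(range(low + 1, high))


# Continuous case: two lengths in inches that are close together.
shorter, longer = Fraction("2.10"), Fraction("2.20")
middle = value_between(shorter, longer)

# Discrete case: numbers of people (or pennies) between 8 and 9.
gap = counts_between(8, 9)

print(float(middle), gap)
```

The midpoint step can be repeated on any pair of lengths, however close, which is why a continuous variable has an unlimited number of values; the empty list shows that a count has nothing between two neighboring values.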
67. 


No matter how similar the weights of two objects, it would 
always be possible to imagine an object whose weight 
was less than one but greater than the other. Therefore, 
"weight" is an example of a ______ (discrete/continuous) variable. 


It will not be necessary to distinguish between discrete 
and continuous variables very often. Most of the 
illustrations in this text involve discrete variables. 
Those illustrations that involve continuous variables 
are treated as if they were discrete variables. For 
example, if you measured a person's height to the 
nearest inch, you might say he was 65 inches tall or 
66 inches tall. Because you "round off" his height to 
the nearest inch, you would never say he is 65½ inches 
tall. In other words, you are treating the 
______ (continuous/discrete) variable called "height" as if it 
were a ______ (discrete/continuous) variable. In other words, 
you are pretending that there are no values between 
65 inches and 66 inches, or between 67 inches and 
68 inches, and so on. 


To determine the number of people in a room, you 
would count them. To determine the number of eggs 
in a basket, you would count them. Whenever you 
______ things, you are determining how many 
(what number) of things there are. 


Any collection or group of things can be counted. The 
procedure we call counting tells you the number of 
things that are in the group or collection. For example, 
we can determine the ______ of windows in 
a building by counting them. 
69. 


70. 


71. 


It is often useful to distinguish between the name of 
something and the thing itself. A "name" is something 
you speak, write, or read. For example, "Boston" is 
the name of a city. You could go to Boston, walk through 
the streets of Boston, or even live in Boston. On the 
other hand, the name "Boston" is something you speak, 
write, or read. Similarly, if you had a dog named 
"Rover," you might scratch the ______ (dog/name) behind the ears, 
whereas you might print his ______ (dog/name) on his dog house. 


Suppose a mother didn't decide to name her new baby 
"Kendall" until three days after the baby was born. The 
______ (baby/name) would be three days old before it was given a 
name. 


We will use quotation marks, " ", when we are referring to the name 
of something rather than to the thing itself. Thus, in the 
previous example we would refer to the baby as Kendall 
and to the name the baby received as "Kendall." Similarly, 
when we speak of the city of Boston and of the name 
"Boston," we will say that you could live in 
______ (Boston/"Boston") and that you write the name 
______ ("Boston"/Boston) as part of an 
address on an envelope. 


The number of things in a group or collection is a 
characteristic of that group of things, just as the color of 
a person's hair is a characteristic of that person. We 
refer to a person's hair color with a name, such as "red," 
"brown," and "black." Similarly, we refer to the number 
of things in a group with a name, such as "five," "twenty," 
"eighteen," and "forty." Thus, "red" would be the name 
of a particular hair color and "five" would be the 
______ of a particular number. 
72. 


74. 


75. 


76. 


77. 


Names for other ______ are "two," "6,"                       numbers 
"four," and "one hundred." 


There are different ways of representing the same 
number. For example, the ______ of                           number 
players on a basketball team could be represented 
either by the name "five" or by the name "5" or by 
the name "V." 


Number is a characteristic of a group or a collection 
of things. The names (such as "ten," "four," "6") 
that are used to represent this characteristic are often 
called numerals. Therefore, the number of players 
on a basketball team could be represented either by 
the numeral "5" or the Roman numeral ______.                 V 


Since the names "red," "green," "Democrat," and 
"Boston" are not the names of numbers, they are not 
numerals. Therefore, of the two names "6" and "blue," 
"______" is a numeral since it is the name of a              6 
number. 


We pointed out earlier that there is a difference 
between the name of something and the thing itself. 
A numeral is the name we give to a number. You 
determine the ______ (number/numeral) of things in a group   number 
by counting them; you represent this characteristic of 
the group by writing or saying a ______ (number/numeral).    numeral 


The same number can be represented by different 
numerals. "Four," "4," and "IV" are three names for 
the same number; in other words, they are three 
numerals which represent the same ______.                    number 
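

The idea that several numerals can name one number can be pictured as a lookup from names to the number they stand for. The table `NUMERALS` and the function `number_named_by` below are hypothetical names introduced only for this sketch; the entries are the numerals mentioned in the frames above.

```python
# Each key is a numeral (a name); each value is the number it names.
NUMERALS = {
    "four": 4, "4": 4, "IV": 4,
    "five": 5, "5": 5, "V": 5,
}


def number_named_by(numeral):
    """Return the number that a given numeral represents."""
    return NUMERALS[numeral]


# Three different names, one number.
same_number = (
    number_named_by("four") == number_named_by("4") == number_named_by("IV")
)

# "blue" is a name, but not the name of a number, so it is not a numeral.
blue_is_numeral = "blue" in NUMERALS

print(same_number, blue_is_numeral)
```

The keys are written in quotation marks because they are names; the values are the numbers themselves.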
79. 
83. 


84. 


Number is often referred to as a variable: for 
example, the number of automobile accidents in 
California each year, the number of people who vote 
in an election, or the number of base hits a baseball 
player makes each game. In each of these cases, 
number could be thought of as a ______. 


Like any other variable, the variable we call number 
has different particular values. Just as "red" and 
"green" are names of particular values of the variable 
we call "color," "three," "4," "20," and "ten" are 
n______ of particular values of the variable we 
call number. 


Consider the following list of letters: b, c, m, p. We 
could determine the ______ of letters in 
this list by counting how many there were. The name 
(numeral) we would use to represent this number would 
usually be "______." 


"Four" and "4" are two names for the same ______. 


Numerals are simply names we use to represent 
different values of the variable we call ______. 


Roman numerals are names for particular numbers. 
Thus, the Roman numeral III represents the number 
we usually represent by the numeral "______." 


Numerals are often used to represent characteristics 
of things other than number, such as weight, length, 
temperature, age, and so on. When we say that the 
temperature is "20 degrees below zero," we are using 
the ______ (number/numeral) "20" as a name for a particular 
value of the variable we call temperature. 
85. When we use the names "twenty inches," "2 feet," and 
"10 feet," we are using the ______ (numerals/numbers)        numerals 
"twenty," "2," and "10" as names for particular values 
of the variable called length. 


86. Football and baseball players often have 
______ (numerals/numbers) written on the backs of their      numerals 
uniforms to help people in the stands to identify the 
different players. 


87. It is important to be careful when numerals are used 
to represent characteristics other than number. It 
makes sense to say that twenty things are twice as 
many things as ten things. It is also appropriate to 
say that something weighing twenty pounds is twice 
as heavy as something weighing ten pounds. It 
______ (does/does not) necessarily make sense to say that a  does not 
baseball player with the numeral "20" written on his 
jersey is twice as good as the baseball player with the 
numeral "10" on his jersey. The fact that one player 
was assigned the numeral "10" and another assigned 
the numeral "20" does not necessarily imply anything 
more about the players themselves than does the 
difference in their names (John and Charles, for 
example). 


88. The value of a variable can often be represented by 
numerals. We shall refer to these variables as 
numerical variables. For example, we discussed an 
experiment earlier in which we were interested in the 
time it took a rat to run down an alley to reach food. 
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92. 


(Continued) 


The values of this running-time variable were
represented by numerals such as "20 seconds,"
"10 seconds," and "800 seconds." Therefore, we
________ refer to running time as a numerical

would/would not

variable.


89. Variables whose values are not represented by
numerals will be called non-numerical variables. If
we were interested in hair color, therefore, the values
we might observe would be black, red, brown, and so
forth. Since these values are not represented by
numerals, we will refer to hair color as a ________
variable.

numerical/non-numerical


90. Political party is a variable, and particular values of
this variable are Democrat, Republican, Socialist, and
so on. This would be an example of a ________
variable.

numerical/non-numerical


91. The age of American presidents when they were elected
to office would normally be a ________ variable.

numerical/non-numerical


92. We said earlier that the data from a study was a record
of the observed values of the variables being studied.
If these observed values are represented by numerals,
the variable under study is a ________ variable.

numerical/non-numerical

We would refer to the data, therefore, as
numerical data.
93. Records of the observed running times in the
experiment in which we observed the time it took
for a rat to reach some food would be
________ data.                                               numerical

numerical/non-numerical


94. Earlier, we considered a study in which people were
asked which of three television channels they preferred:
5, 7, or 9. We could name the three possible answers
they gave as "5," "7," or "9." The names "5," "7,"
and "9" are used simply as names for their answers.
However, "5," "7," and "9" are also used as names
for the values of the variable we call number and are
therefore called numerals. Even though the variable
we call "their answer" is not really a number, its
values are represented by ________.                          numerals

numerals/numbers


95. Because we said that any variable whose values could
be represented by numerals would be called a
numerical variable, we would say that the person's
answer in the television survey was a ________               numerical
variable, even though the

numerical/non-numerical

numerals we use as names for the values of this
variable ________ represent different numbers in this        do not

do/do not

particular case.


96. Because the list of the observed values of the answer
variable would be a list of numerals, it would be an
example of numerical ________.                               data


97. A list of the possible values of answers in the television
survey ________ be called numerical data, however.           would not

would/would not
98. Such a list would not be called numerical data because,
although it is a list of values represented by numerals
(which makes it a numerical list), it is not a list of
observed values. Only a list of ________ values              observed
is referred to as data.


99. Earlier, we listed how many pins a person knocked
down each time he rolled a bowling ball. Each time he
rolled the ball he could knock down anywhere from
0 to 10 pins. Thus, any number between 0 and 10 was
a(n) ________ value of the variable "pins                    possible

possible/observed

knocked down."


100. Each value of the variable "pins knocked down" was
represented by a numeral. Therefore, "pins knocked
down" is an example of a ________                            numerical

numerical/non-numerical

variable.

The numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 are the
________ values of that variable.                            possible


101. The list 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ________       would not

would/would not

be considered data because it is simply a list of the
________ values of the data.                                 possible

possible/observed


102. The list of observed values of the variable used in the
earlier illustration was 0, 0, 2, 10, and 5. This list
________ be considered data.                                 would

would/would not
103. This list of observed values ________ be                 would

would/would not

considered numerical data, since each value of the
variable (the number of pins knocked down) is
represented by a ________.                                   numeral
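
The distinction drawn in the last few frames between possible values and observed values can be sketched in a few lines of Python. This is only an illustration: the names `possible_values`, `observed_values`, and `is_numeral` are made up for this sketch, and treating a Python integer as "a value written with a numeral" is an assumption of the sketch, not something the program states.

```python
# Possible values: every count of pins one ball could knock down (0 through 10).
# This list exists before anything is observed, so it is not data.
possible_values = list(range(0, 11))

# Observed values: what was actually recorded in the earlier illustration.
# Because these were observed, this list is data.
observed_values = [0, 0, 2, 10, 5]


def is_numeral(value):
    """Assumption for this sketch: an int stands for a value written as a numeral."""
    return isinstance(value, int) and not isinstance(value, bool)


# Every observed value has to be one of the possible values.
all_possible = all(v in possible_values for v in observed_values)

# The observed list is numerical data because every value is a numeral.
is_numerical_data = all(is_numeral(v) for v in observed_values)

print("possible values:", possible_values)
print("observed values:", observed_values)
print("observed values are all possible:", all_possible)
print("observed list is numerical data:", is_numerical_data)
```

Note that an observed value may repeat (0 appears twice), while each possible value is listed only once.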


104. One way in which men differ from one another is
whether or not they have a beard. Suppose you were
interested in how many men wore beards. If you went
to a busy intersection and made a list of whether or not
each man who passed wore a beard, your list might look
like this after 5 people had passed:

PERSON    BEARD
1         yes
2         no
3         yes
4         no
5         yes


105. Since this list is a record of your observations, it
________ be referred to as data.                             could

could/could not


106. Here we simply recorded the values of the variable
we were studying as "________" or "________" rather than     yes, no
writing out "He wore a beard" or "He didn't wear a
beard."
107. When we recorded the observations of whether or not a
man wore a beard, we could have used the numerals
"1" or "0" instead of "yes" or "no." We could have
recorded a "1" for each man who had a beard and a
"0" for each man who didn't have a beard. Thus, the
list of observations we just presented would look like
this:

PERSON    BEARD
1         1
2         0
3         1
4         0
5         (?)

Since the fifth person we saw had a beard, we would
complete our list by replacing the question mark with
a ________.

1/0


108. Because we represented the values of the variable we
are studying with numerals, our data is ________.

numerical/non-numerical


109. Earlier, we stated that ________ is the

number/numeral

characteristic of a group or collection of things which
we determine by counting how many things are in the
collection.


110. The numerals "1" and "0" are usually used as names
for particular numbers. However, when we use these
numerals as names for the two values of the variable
"bearded or beardless," they ________ represent

do/do not

the characteristic we call number.
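
The recoding described in these frames can be sketched as follows. The five observations are the ones listed in the program; the dictionary names `code` and `swapped_code` are made up for this illustration.

```python
# The five observations from the list in the program.
observations = ["yes", "no", "yes", "no", "yes"]

# Substitute a numeral for each non-numerical name.
code = {"yes": 1, "no": 0}
coded = [code[answer] for answer in observations]
print(coded)

# The numerals are only names. Swapping them records exactly the same
# observations, because the original list can still be recovered.
swapped_code = {"yes": 0, "no": 1}
swapped = [swapped_code[answer] for answer in observations]
recovered = ["yes" if value == 0 else "no" for value in swapped]
print(recovered == observations)

# Counting the men recorded as "1" does give a number (how many wore
# beards), but that number describes the collection of men, not any
# single value of the "bearded or beardless" variable.
bearded = coded.count(1)
print(bearded)
```

The data became numerical only because numerals were substituted for "yes" and "no"; nothing about the men themselves changed.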
111. Thus, we have just seen a case where our data was
numerical but the numerals didn't have anything to
do with the characteristic we call number. The
________ were simply used as names for the                   numerals

numerals/numbers

different values of the variable we were studying: "1"
if he had a beard or "0" if he didn't.


112. When we talk about number, we can say such things as
"three people are a greater number of people than
________ people."

two/ten


113. We can say the number represented by the numeral "4"
is twice as large as the number represented by the
numeral "________."


114. We can also say that since 7 minus 5 equals 2, and
3 minus 1 equals 2, the difference between 7 and 5 is
the same as the difference between 3 and ________.


115. Sometimes when numerals are used to represent
variables other than number, we can make similar
statements. For example, it would make sense to say
that a temperature of 100 degrees above zero was
greater (or hotter) than a temperature of ________ degrees

50/150

above zero.


116. It would also make sense to say that "4" pounds was
twice as heavy as "________" pounds.


117. It would also make sense to say that the difference in
height between a person who was 5 1/2 feet tall and one
who was 6 feet tall was the same as the difference
between someone who was 4 1/2 feet tall and someone who
was ________ feet tall, since the difference in height was

5/7

1/2 foot in both cases.
118. We have seen cases, however, where numerals have been
used to represent values of a variable so that statements
of this sort are not appropriate. For example, when we
considered asking people to choose their favorite from
among three television channels, we represented the values
of the answer variable with the numerals "5," "7," and
"9." It ________ make much sense to say that

would/would not

choosing Channel 7 was greater than choosing Channel 5.


119. Also, it ________ be appropriate to say that the

would/would not

difference between answering "5" and answering "7" was
the same as the difference between answering "7" and
answering "9" because 7 - 5 = 9 - 7.


120. You should NOT assume that just because a variable is
numerical it is similar to "number" in some way. It
________ possible for a variable to be numerical and have

is/is not

nothing more in common with "number" than the use of
numerals as names for its values.


121. A variable is numerical simply because we use
________ as names for values of the variable.


122. Later, we will consider in greater detail the use of
numerals to represent variables other than number. For
the present, we only wish to emphasize that statements like
"greater than," "the same difference as," and "twice as
much as" ________ appropriate or make

are always/may not be

sense, even when a variable is represented numerically.
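
A short Python sketch may make the warning concrete. The six answers below are hypothetical, invented for this illustration; they are not data from the program. The arithmetic runs without complaint, which is exactly why care is needed.

```python
from collections import Counter
from statistics import mean

# Hypothetical answers from six people in the television survey.
# The numerals 5, 7, and 9 are only names for the three channels.
answers = [5, 7, 7, 9, 7, 5]

# The computer will average the numerals, but the result names no channel
# and says nothing sensible about anyone's preference.
average = mean(answers)
print("average of the numerals:", average)
print("is the average one of the channels?", average in (5, 7, 9))

# Counting how often each name occurs does make sense for this variable.
counts = Counter(answers)
most_chosen, times_chosen = counts.most_common(1)[0]
print("most often chosen channel:", most_chosen, "chosen", times_chosen, "times")
```

Counting occurrences treats the numerals as names, which is all they are here; averaging treats them as numbers, which they are not.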


123. Many variables of interest to a scientist are similar in
some way to the variable we call "number." For example,
the variable called "length" is similar to "number," so it
would make sense to say something ten inches long is
longer than something ________ inches long.

5/12
125.


We determine the number of a collection of objects by
counting them. We determine the length of something
in a variety of ways, although the most common
procedure is to use a ruler. Both counting and the use
of a ruler are procedures by which we determine the
appropriate numeral to represent a value of a particular
variable. Procedures of this sort are called
measurement procedures. Another example of a
m__________ procedure would be the use of
scales to determine a person's weight.


The numeral 10 represents a ________ number

larger/smaller

than does the numeral 5, and ten ounces represents a
________ weight than five ounces. Therefore, it

larger/smaller

is possible to make statements about values of the
variable "weight" similar to the statement you made
about the variable "number." For example, you could
say the number 20 is twice as large as the number 10,
just as a weight of 20 pounds is ________ as
heavy as a weight of 10 pounds.


On the other hand, some variables are not at all
similar to the variable called "number," even though
you could use numerals to represent values of these
variables. For example, the fact that the license
number on your automobile is larger than the license
number on your neighbor's automobile probably
________ indicate any difference between the two

does/does not

automobiles (although it might indicate you obtained
your license number on an earlier date than did your
neighbor).
126. Whenever a variable is similar to the variable we call
"number," it is possible to assign numerals to represent
values of that variable in such a way that you maintain
the similarity between that variable and the variable
"number." Determining how similar a particular
variable is to the variable we call "number," and
assigning numerals in the appropriate manner, is called
measurement. We will not have time to consider the
topic of measurement in this program. The major point
we wish to emphasize is that you should be careful not
to suppose a particular variable is similar to "number"
just because that particular variable is numerical.
Numerals are often used to represent values of a
variable simply because numerals are convenient or
familiar names. Just because one value of a variable
is represented by the numeral 8 and another value of
the variable is represented by the numeral 4 ________
necessarily imply that one value is in

does/does not

any sense larger or greater than the other one.


127. In this section, we have distinguished between "numeral"
and "number" in order to indicate how easy it is to
convert a non-numerical variable into a numerical one
by simply substituting numerals for the non-numerical
names of the values. On the other hand, in most of
your reading you will find no distinction is made between
"number" and "numeral." Therefore, throughout the
remainder of this program we will use the word "number"
without attempting to distinguish between "number" and
"numeral." Remember, however, that using "numerals"
to represent the values of a variable does not ensure any
similarity between that variable and the variable called
"number." The procedure of deciding how similar a
variable is to number and assigning numerals in an
appropriate fashion to represent this similarity is called
m__________.
128. It is often useful to list things underneath one another,
in what is called a column. The following list of
numerals is arranged in a column.

8
6
5
2
9

The following list of colors is also arranged in a
c__________.                                                 column

red
green
blue
green


129. Another way of listing things is side by side, in what is
called a row. The same list of numerals we just
arranged in a column could also be arranged in a row,
as follows:

8, 6, 5, 2, 9

The same list of colors we just arranged in a column
could be arranged in the following ________:                 row

row/column

red, green, blue, green
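
As a small illustration, the same list can be written out either way in Python. The list of numerals is the one used in the frames above; the variable names are made up for this sketch.

```python
# The list of numerals used in the frames above.
numerals = [8, 6, 5, 2, 9]

# A column: the items are listed one underneath the other.
as_column = "\n".join(str(n) for n in numerals)

# A row: the same items are listed side by side.
as_row = ", ".join(str(n) for n in numerals)

print(as_column)
print()
print(as_row)
```

Only the arrangement differs; the items and their order are the same in both forms.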
130. Since a column is a list of things one underneath the
other and a row is a list of things arranged side by
side, these numerals are listed in a ________ and            row

row/column

2158/29, 26;.-3

these days are listed in a ________:                         column

row/column

Monday
Tuesday
Wednesday
Thursday
Friday

131. Earlier, we considered an experiment in which we
observed how fast a rat ran down an alley to reach food
on ten successive days. The following data was
presented as an example of what we might have observed:

DAY    RUNNING TIME
1      200 sec.
2      100 sec.
3      150 sec.
4      80 sec.
5      40 sec.
6      41 sec.
7      15 sec.
8      10 sec.
9      4 sec.
10     3 sec.

The different observed values of the running time
variable were listed in a ________.                          column

row/column
132. We can also list things in a pair of columns placed
side by side. For example, consider the following:

COLUMN    COLUMN
ONE       TWO
red       brown
green     orange
blue      pink

Thus, the colors appearing in the first column are red,
green, and blue and those appearing in the second
column are ________, ________, and ________.                 brown, orange, pink


133. We could also think of the same arrangement of colors
as being made up of three rows. For example:

ROW ONE      red      brown
ROW TWO      green    orange
ROW THREE    blue     pink

Thus, the colors appearing in the first row are red and
brown, while the colors appearing in the ________ row        third
are blue and pink.


134. A table is an arrangement of things in rows and columns.
The arrangement of colors we just considered is a
table because it was formed by ________ rows and ________    3, 2

2/3                                2/3

columns.
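
One way to picture a table in code is as a list of rows, each row being a list of entries. The sketch below uses the arrangement of colors from the frames above; representing a table as nested Python lists is a choice made for this illustration only.

```python
# The arrangement of colors, stored row by row.
table = [
    ["red", "brown"],     # row one
    ["green", "orange"],  # row two
    ["blue", "pink"],     # row three
]

n_rows = len(table)        # how many rows the table has
n_columns = len(table[0])  # how many entries are in each row

# Reading down the table instead of across gives the columns.
columns = [list(column) for column in zip(*table)]

print("rows:", n_rows, "columns:", n_columns)
print("column one:", columns[0])
print("column two:", columns[1])
print("row three:", table[2])
```

The same six entries are reached either way: by row first and then column, or by column first and then row.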
135.


136. 


137. 


The table of numerals shown below consists of ________

2/3

columns and ________ rows.


Notice in the table of numerals that the numeral 8 is
located in the first row AND the first column. The
numeral 6 is also located in the first column but in the
________ row.

first/second


Every numeral in the table falls in a particular
combination of row and column. For example, the
numeral ________ is located at the intersection of the
third column and the first row.


The numeral 5 is located in the ________ row and
the ________ column.

Notice that the numeral 4 is located in the first row,
second column and also in the ________ row,
________ column.


In each of the tables we have seen, "row one" was the
row on the ________ and "column one" was the

top/bottom

column on the ________. This is the way we will

left/right

always number the rows and columns in a table.
second


second 
second 


second 
third 


top 
138. It is often useful to present or record data in the form
of a table. Suppose you were interested in how a
person who took a three-day trip spent his money for
food. You could record your observations of this
"food cost" variable in a table like the one that follows.

             THURSDAY    FRIDAY    SATURDAY
Breakfast    $1.00       $1.15     $1.50
Lunch
Dinner                   $4.15     $5.00

Notice that instead of numbering rows and columns, we
have identified the ________ with the various meals

rows/columns

of the day, while the ________ are identified with

rows/columns

particular days of the trip. The names of the meals
and the names of the days identify the names of the data
in the columns and the rows. They are not themselves
part of the data, since you ________ have written

could/could not

these row and column headings before you actually made
the observations.


139. The row and column headings identify the data; they
________ part of the data. Since "breakfast" is the

are/are not

row heading of the first row and "Friday" is the column
heading of the second column, the entry in the first row
and second column of the data ($1.15) is the amount
spent for ________ on ________.

breakfast/lunch          Thursday/Friday

The amount spent for dinner on Friday was $________,
while the amount spent for ________ on ________
was $5.00.
rows


columns


could


are not


breakfast, Friday


4.15
dinner, Saturday
140. How much money was spent for all three meals on
Saturday? $________                                          8.00

How much was spent for all three lunches? $________          7.00

The day he spent the most money on breakfast, lunch,
and dinner was ________. The most expensive                  Friday
meal on each of the three days was ________.                 dinner


141. You may have heard of a psychology test called the
Rorschach test, a test in which people are asked to
report their impressions of ink blots, like the one
shown below:


This ink blot was formed by spilling ink onto the center
of a piece of paper and then folding the paper in half so
that the ink is squeezed or blotted between the folds of
the paper. When Dr. Rorschach invented this test, he
believed the shape of the blot was so vague that a
person's answers would reflect as much about the
person as about the ink blot. Suppose you were
developing a test of this sort. You might make four
different ink blots and show all of them to ten people.
You could ask each of the people to report whether
they thought a particular ink blot gave them an
impression of something that was "pleasant," "neutral,"
or "unpleasant." Thus, their answer would be a
________ variable whose three                                non-numerical

numerical/non-numerical

possible values were "________,"                             pleasant,
"________," and "________."                                  neutral, unpleasant
142. You could record the answers of the subjects in a table
which might look like this:

SUBJECT    INK BLOT A    INK BLOT B    INK BLOT C    INK BLOT D
1          pleasant      pleasant      neutral       pleasant
2          neutral       neutral       neutral       pleasant
3          neutral       pleasant      neutral       pleasant
4          neutral       unpleasant    neutral       pleasant
5          pleasant      neutral       pleasant      pleasant
6          unpleasant    unpleasant    unpleasant    unpleasant
7          neutral       neutral       neutral       pleasant
8          pleasant      neutral       neutral       pleasant
9          neutral       neutral       neutral       pleasant
10         pleasant      pleasant      pleasant      pleasant


143. Notice that "INK BLOT A" is a ________ heading and

row/column

"SUBJECT 1" is a ________ heading.

row/column


144. The row and column headings ________ data, whereas

are/are not

the terms "pleasant," "neutral," "unpleasant" in the
table ________ data (since they are observed values

are/are not

of the "answer" variable).


145. The ________ headings identify who made the answer,

row/column

while the ________ headings identify the ink blot

row/column

being shown when the answer was made.
146. All the answers in the first column of the table were
made in response to ink blot ________, whereas all           A

A/D

the answers in row 1 were made by subject ________.          1

1/10

Subject 1 thought blot A was pleasant, blot B was
pleasant, blot C was ________, and blot D was ________.      neutral, pleasant
Subject 4 thought blot B was
unpleasant, but he thought blot ________ was pleasant.       D
Subject 7 thought blot D was ________.                       pleasant


147. Since the answers from all ten subjects to ink blot A are
in the first column, we could determine how many
subjects reported ink blot A was "pleasant" by counting
all the occurrences of "________" in the ________            pleasant, first
column.


148. Four subjects reported ink blot A seemed "pleasant,"
whereas ________ subjects reported that ink blot B           three
seemed "pleasant."

149. Subject 1 felt three of the blots were pleasant, whereas
subject 2 saw only ________ of the blots as pleasant.        one

150. What if you were told that one of your subjects was
severely depressed and had been under treatment by a
psychotherapist? You might expect someone who was
very depressed or sad to find ink blots ________             more

more/less

"unpleasant" than would a typical subject.


151. You would compare different ________ of the              rows

rows/columns

table in order to compare how different subjects
reacted to the ink blots.
152. If you were looking for a subject who seemed to find
an unusual number of the blots unpleasant, you would
probably pick subject ________, since this subject found
________ of the blots unpleasant.


153. There might be something about particular ink blots
which really does make them seem more or less
pleasant. If you had to pick out the ink blot which
might really appear more pleasant than the rest, you
would probably pick ink blot ________, since more subjects
found this ink blot pleasant than found any of the other
three ink blots pleasant.
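
The counting described in the last several frames can be sketched in Python. The table below is our reading of the printed ink-blot table, so treat the individual entries as an assumption of this sketch; the names `BLOTS`, `answers`, and the helper variables are made up for the illustration.

```python
# Column headings: the four ink blots.
BLOTS = ["A", "B", "C", "D"]

# Row headings (subject numbers) mapped to that subject's four answers,
# in the order A, B, C, D. Entries are transcribed as an assumption.
answers = {
    1: ["pleasant", "pleasant", "neutral", "pleasant"],
    2: ["neutral", "neutral", "neutral", "pleasant"],
    3: ["neutral", "pleasant", "neutral", "pleasant"],
    4: ["neutral", "unpleasant", "neutral", "pleasant"],
    5: ["pleasant", "neutral", "pleasant", "pleasant"],
    6: ["unpleasant", "unpleasant", "unpleasant", "unpleasant"],
    7: ["neutral", "neutral", "neutral", "pleasant"],
    8: ["pleasant", "neutral", "neutral", "pleasant"],
    9: ["neutral", "neutral", "neutral", "pleasant"],
    10: ["pleasant", "pleasant", "pleasant", "pleasant"],
}

# Comparing columns: count "pleasant" down each ink blot's column.
pleasant_per_blot = {
    blot: sum(1 for row in answers.values() if row[i] == "pleasant")
    for i, blot in enumerate(BLOTS)
}

# Comparing rows: count "unpleasant" across each subject's row.
unpleasant_per_subject = {
    subject: row.count("unpleasant") for subject, row in answers.items()
}

most_pleasant_blot = max(pleasant_per_blot, key=pleasant_per_blot.get)
most_unpleasant_subject = max(unpleasant_per_subject, key=unpleasant_per_subject.get)

print(pleasant_per_blot)
print(unpleasant_per_subject)
print("blot most often found pleasant:", most_pleasant_blot)
print("subject with the most unpleasant answers:", most_unpleasant_subject)
```

The headings (blot letters and subject numbers) are used only to locate each answer; the data are the words "pleasant," "neutral," and "unpleasant" themselves.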


154. Notice in the table of data from the ink blot study
how the column headings refer to particular ink blots.
We could think of each of these four ink blots as a
particular value of a variable, which we might call the
"ink-blot" variable. Thus, each column of the table
would be identified with a particular ________ of the
"ink-blot" variable.


155. Similarly, each row is identified with a particular
subject. We could think of the subject number as a
particular value of a variable we might call the
"subject" variable. Thus, a particular row would be
identified with a particular ________ of the
"subject" variable.
156. When we draw the table, we might include the
name of the variable whose particular values are
associated with each row. For example, consider the
following table:


In this table, the ink blots can be thought of as a
variable, whose particular values are A, B, ________,        C
and ________.                                                D

Each value of the ink-blot variable is identified by a
particular ________.                                         column

column/row


157. You could also think of the subject to whom the ink
blot was presented as a variable. Therefore, each
particular subject is a particular ________ of that          value
variable. The word "subject" in the table we have just
seen refers to a variable, and the numerals 1, 2, 3, and
4 identify particular ________ of that variable.             values


158. The table is useful in organizing the data in terms of
the subject variable and the ink-blot variable. The
________ are associated with particular ink blots            columns

rows/columns

and the ________ are associated with the                     rows

rows/columns

particular subjects.
159. If you were interested in comparing one subject's
responses with another subject's responses, you would
compare ________. On the other hand, if you

rows/columns

were interested in comparing the responses to a
particular ink blot with the responses to another ink
blot, you would compare ________.

rows/columns


160. For example, in the table we just saw, it is a simple
matter to determine which ink blot was found
unpleasant by the most subjects. Since you are
comparing different ink blots, you would compare
different ________. The column that contains
the most "unpleasant" responses is ________. On the
other hand, to find which subject made the most
"neutral" responses, you would compare different
________. You would find that subject ________

rows/columns

was the one who had made the most "neutral" responses.


161. Suppose you were interested in comparing the grades
made by four students in three different courses. We
could arrange their grades in a table, as follows:


COURSE


DATA ON GRADES IN VARIOUS COURSES


Notice the rows are identified by the initials of
particular ________, whereas the columns are

subjects/courses

identified with particular ________.

subjects/courses
162. The word "course" is written at the top of the table
and might be thought of as the name of a variable
whose particular values correspond to the particular
________ in the table. You could think of

columns/rows

________, ________, and ________
as particular values of the "course" variable.


163. We identify each student by an initial. Thus, if we
think of students as a variable, the particular values of
that variable are ________, ________, ________, and
________.


164. You would find the course in which the students did most
poorly by comparing different ________. You would

columns/rows

find the student who had done the best in his courses by
comparing different ________.

rows/columns


165. In the preceding table, the variable we are interested
in is "grade." The different values of that variable are
A, B, C, and D. The data presented in this table are
________ observed values of the grade variable.

12/4


166. Earlier we considered a way in which you might keep
track of your improvement as you took bowling lessons.
After each lesson, you could bowl 5 balls and record
the number of pins you knock down with each ball. For
the purposes of this test, you always set up any pins
you knock down before you bowl the next ball. Therefore,
the fewest pins you could knock down with any one
ball is ________, while the greatest number of pins you
could knock down with one ball is ________.
columns


Math, English, Gym 


AM, LT., AC, 
RK. 
167. Suppose you took 4 lessons and bowled 5 of these
test balls after each lesson. Imagine you recorded
the data from these tests in the following table.


DATA FROM TESTS OF IMPROVEMENT


Since your data are observed values of the
"pins-knocked-down" variable, each value is identified with
a particular ball you rolled following a particular
lesson. In this table, the 5 test balls you rolled are
associated with 5 different ________, while the

rows/columns

4 lessons that you took are associated with 4 particular
________.

rows/columns


168. The total number of pins knocked down following the
first lesson was ________.

6/10


169. The total number of pins knocked down following the
second lesson was ________. The most pins were
knocked down following lesson ________, since there
were ________ pins knocked down with the 5 balls rolled
on that day.
170. We have now looked at several tables and can
summarize certain things about tables. First of
all, the definition of a table is "any arrangement of
things in ________ and ________." When the
things you are arranging in rows and columns are data,
the row and column headings help you to identify each
particular piece of data, but they ________ part of the

are/are not

data.


171. It is often useful to think of the row and column headings
as particular values of a ________.


172. One of the first things you should learn to look at when
you see a table of data is the row and column headings.
These headings identify where the observed values
that make up the ________ were obtained.


173. For example, the table shown below contains data from
two different subjects: subject ________ and subject ________.
These two subjects were observed on four different ________.
rows, columns


are not 


variable 


data 


A, B 


days 
174. The data in the table below is ________                  different from

the same as/different from

the preceding table, since the data below was obtained
from four ________ on two different ________.                subjects, days


DAYS
SUBJECTS

This illustrates how carefully you should examine the 


row and column headings on a table. Many errors in 
interpreting data can be traced to a simple failure to 
pay sufficient attention to the headings on the table. 


175. You have seen that one way of presenting or recording
a list of values for some variable is to arrange the
names of these values in ________ and ________               rows, columns
to form a table.


176. Next, we will consider a way of representing values of
a variable other than simply arranging the names of
the values in rows and columns to form a ________.           table


177. If you represented the numeral 1 with a square, you
could represent the numeral 2 by putting another square
on top of it. Thus, drawing ________ shown below would       A
represent numeral 1, whereas drawing ________ would          B
represent numeral 2.
178. By adding a third square on top, you could represent
the numeral 3. In fact, you could keep adding
squares, one on top of the other, for each new numeral
you wished to represent. For example, of the two
pictures shown below, Picture A would represent the
numeral 3, whereas Picture B would represent the
numeral ________.


A B


179. You can think of these squares placed atop each other
as forming columns, and you could place these columns
side by side as shown below. Each column represents a
numeral in the same way as the columns of squares you
just considered. In the picture shown below, we have
identified each column with a letter placed directly
underneath the column. Column A represents the
numeral "3" since this column is three squares high.
Column B represents the numeral "________" since this
column is four squares high, and column C represents
the numeral "________."
179. (Continued)


The width of each column is the same, but the height
differs depending upon which numeral the column
represents. In other words, the ________ of the              height

width/height

column determines which numeral it represents.


180. The same three columns are shown below. Now, however,
we have drawn a line with marks on it to represent
the different possible heights of the column. For
example, the mark next to the numeral 1 is at the
same height as a column one square high. The mark
next to the numeral 2 is at the same height as a column
two squares high, and the mark next to the numeral 3
is as high as a column ________ squares high.                3


181. Notice Column A is 3 squares high. This is indicated
by the fact that it is the same height as the mark next
to the numeral ________.
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182. 


183. 


184. 


Although Column B is not touching the line with the 

marks on it, you can see that Column B is 4 Squares 

high and, therefore, at the same height as the mark 

next to which the numeral yis written. 4 


The mark next to the numeral2 is at the same height 


as the top of Column since that column is two E 
A B/C 


squares high. 


Suppose you were tossing a die (one of two dice). You 
might be interested in the number of dots shown on its 
top when it came to rest. For example, the following 
die has dots showing on its top surface. 2 


The number of dots appearing on the top of the die after 

each new toss be a variable since this would 
"would/ would not 

number could change or vary from toss to toss. The 

possible values of this variable could be represented 


by the numerals" 5, mm 4, " əə mr 3, mm 1, " and" Ball 6 


186. We could represent each possible number of dots
showing on the top of the die by a column of squares.
For example, we could represent the die falling so
that 3 dots were on top by column ____ shown below:
                                  A/B

[Figure: two columns of squares, labeled A and B, drawn beside a line with marks at 3 and 4.]

187. Since the largest number of dots we could observe on
the top of the die would be ____, the highest possible
column of squares we would need would be ____ squares
high.


Since the fewest dots we could observe on the top of
the die would be a single dot, we could represent this
outcome by a column ____ square high.

189. Suppose you tossed a die four times and recorded the
number of dots showing after each toss. Your results
might be like those shown in the following table.

On the first toss, there were 5 dots showing, on the
second toss there were 3 dots showing, on the third
toss there were ____ dots showing, and on the fourth    6
toss there was ____ dot showing.    1

190. Notice how the data in the previous table are represented
by a ________ of numerals indicating the different    column
     column/row
observed values of the variable.

The column heading "dots showing" can be thought of
as the name of the variable we are observing and the
numerals in that column as the ________    observed
                               observed/possible
values of that variable.


Notice that we have used numerals to represent the 
different observed values of the variable. These 
numerals are simply names for the values. Another 
way of representing these values is with columns of 
squares that form a sort of picture of the data. 


Four columns are shown on the following page. Each 
column represents one of the observed values in the 
previous table. We have identified the particular toss 
represented by each column by writing its name 


directly below that column. 
(Continued)

The column for the third toss is ____ blocks high since
6 dots were showing on the ____ toss.

[Graph: DOTS SHOWING (vertical axis) for each TOSS (horizontal axis), with each column drawn as a stack of squares.]


Since the line next to the column on the left has marks 
to indicate the different possible heights of the columns, 
we really don't need the lines between each square in


the column. 


We have redrawn the columns below, leaving out the 


lines separating each square in a particular column. 


[Graph: the same four columns redrawn without the lines separating the squares; DOTS SHOWING on the vertical axis, TOSS on the horizontal axis.]

We can still see that the third column represents the
numeral 6 since the third column is the same height as
the mark next to the numeral ____.

192. A figure of this sort, where the values of a variable are
represented by the height of each column, is a type of
graph. In this graph, each value of the variable is
represented by the ________ of each column    height
rather than by a name or numeral, as in the previous
table.

193. When you roll a die, there will be a certain number of
dots showing on its top face when the die comes to rest.
For example, the die shown below has ____ dots    6
showing on its top face.

194. Suppose you tossed a die four times and recorded the
observations shown in the following table:

According to these data, the fewest dots were
showing on the ________ toss.    third
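The idea of representing each observed value by the height of a column can be sketched in a few lines of Python. This is only an illustration, not part of the original program: the function name `column_graph` and the use of `#` for a square are assumptions, and the data are the four tosses described earlier (5, 3, 6, and 1 dots).

```python
# Hypothetical helper: draw each observed value as a column of "#" squares.
def column_graph(values, labels):
    top = max(values)
    lines = []
    # Work from the highest mark down to 1, one text row per mark.
    for height in range(top, 0, -1):
        row = "".join("  # " if v >= height else "    " for v in values)
        lines.append(f"{height:>2} |{row}".rstrip())
    lines.append("   +" + "----" * len(values))
    lines.append("    " + "".join(f"{lab:^4}" for lab in labels))
    return "\n".join(lines)

tosses = [5, 3, 6, 1]  # dots showing on the 1st, 2nd, 3rd, and 4th toss
graph = column_graph(tosses, ["1st", "2nd", "3rd", "4th"])
print(graph)
```

Each column contains as many squares as the value it represents, so the third column, which stands for 6 dots, is the only one that reaches the mark next to the numeral 6.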
196. Two graphs are shown below.

[Graph A and Graph B: DOTS SHOWING (vertical axis) for the 1st, 2nd, 3rd, and 4th TOSS (horizontal axis).]


In the previous table it was indicated that there were
3 dots showing on the first toss. Both Graph A and
Graph B indicate there were ____ dots showing on the
first toss because the column representing the first
toss in both graphs is ____ squares high.

Compare the data shown in each of the previous graphs
with the data shown in the previous table. Graph ____
                                                 A/B
agrees perfectly with the table, whereas graph ____
                                               A/B
does not.
198.


(Continued) 


Both the table and Graph B indicate one dot was showing 
on the third toss, whereas Graph A indicates two 
dots were showing on the third toss. 


Notice how easy it is to compare the values in the two 
graphs because the values are shown as a picture, rather 
than as numerals in a table. Because it shows a picture 
of the values rather than simply listing their names, a 
________ is useful.    graph
graph/ table


Earlier, you considered an experiment in which we
recorded the time it took a rat to run down an alley to
reach food. The data for that experiment were observed
values of the "running-time" variable. Let's consider
how we could represent values of the running-time
variable in the form of a graph. Four values we might
have observed in that experiment are shown below in the
form of a ________.
          graph/ table

[Table: RUNNING TIME, in seconds, observed on each of four days.]




199. The longest observed running time was ____ seconds.
The shortest observed running time was ____ seconds.

200. The graph shown below represents the same data as the
previous table. Notice that we have once again
represented a value of the variable by the height of each
column. The column representing Day 1 is the same
height as the mark next to the numeral 30. The column
representing Day 2 is the same height as the mark next
to the numeral ____. At the left of the graph we
have written what the numerals represent, that is,
"running time" in seconds. Since the column for Day 1
is next to a mark with the numeral 30, you know that
the rat took ____ seconds to run down the alley to
reach the food on the first day.

[Graph: RUNNING TIME IN SECONDS (vertical axis, marked 10, 20, 30, 40) for each day.]

201. On Day 4, the rat took 5 seconds to reach the food (as
was indicated in the previous table). The column for
Day 4 is only half as high as the mark next to the
numeral 10. In other words, it represents a running
time of ____ seconds.
202. A table and a graph are shown below. The figure
marked A is the ________ and the figure    table
                graph/ table
marked B is the ________.    graph
                graph/ table

[Figure A: a table of SCORE for each DAY.]
[Figure B: a graph of SCORE (vertical axis) for each DAY (horizontal axis).]

203. The numerals along the side of the graph are different
values of the ________ variable.    "score"
              "day"/ "score"
204. The largest observed "score" was ____ on Day ____.
                                      10/15        2/3

Notice how clearly this is shown by the graph: Day ____
                                                   2/3
has the highest column.
206. Consider the following graph and table. The "score"
value shown for Day 1 ________ the same on the graph
                      is/is not
as on the table.

[Graph: SCORE (vertical axis) for each DAY; table of the same data with a question mark in place of one score.]

A score of ____ is shown for Day 2, both on the graph
           5/8
and on the table.

The observed value for Day 3 is indicated in the
________ but not in the ________.
graph/ table            graph/ table

To make the table identical with the graph, you should
put a score of ____ in place of the question mark shown
in the table.

It is clear that the scores were the same on Day 2 and 3
because the columns are the same ________ for
both days on the graph.
207. Another graph and table are shown below. Fill in the
table so that it is identical to the data shown on the
graph. Now compare your work with the table given as
the answer.

[Graph: SCORE (vertical axis, marked 0 to 10) for each DAY.]

208. The data in the following graph are values of a variable
named "________." A value of this variable    cost
was observed on each of three ________.    days

[Graph: cost in dollars (vertical axis) for each of three days.]



208. (Continued)

The largest cost was observed on Day ____, since the
column for that day is ________ than any other
column.

209. To fill in the column and row headings in the following
table so that it represents the same things as the graph
we just considered, you would label the column
containing the values with the word "________" and
the three rows with the numerals "____," "____," and "____."

210. Since each row represents a different ________ on which
a cost was observed, the square above the numbers 1,
2, and 3 should have the word "________" written in it.

211. Fill in the table below so that it represents the same
things as the graph does.
212. What if you had a score of zero and you wished to
represent it in a graph? If a column three squares
high represented a score of 3, and a column two
squares high represented a score of 2, and a column 1
square high represented a score of 1, then a column no
squares high would represent a score of ________.    zero (0)
213. Therefore, on the following graph a score of zero was
recorded on Day 2 and Day ____.    4

[Graph: SCORE (vertical axis, marked 0 to 4) for each DAY.]

214. Notice the line on the left of the graph has a mark to
indicate the height of each column and a numeral
indicating the "score" value of each height. The
numeral ____ indicates that if a column had
no height at all it would represent a score of ____.    0
215. Often you will see graphs in which the top of a column
is at a height between two of the numerals or marks
indicating values.
215. (Continued)

For example, the following column would probably
indicate the value ____ because it is lower than
                   20/30/40
40 but higher than 20, and approximately half-way
between their marks.

[Graph: SCORE (vertical axis, marked 0, 20, 40) for one DAY.]

216. According to the following graph, the score on Day 1
was ____ and the score on Day 2 was ____.
    25/ 50                          15/100

[Graph: SCORE (vertical axis, marked 50 and 100) for each DAY.]
217. In the same way, you could represent the value 1½ with
a column one and one-half squares high. For example,
Column A would represent a score of 1½ and Column B
would represent a score of ____ because it is
                           2/2½
between the height of 2 squares and the height of
3 squares.

218. Suppose you were recording how much time it took a
person to solve a puzzle. You might have collected the
data shown in the table below:

SUBJECT    TIME TO SOLVE
1          20 Seconds
           15 Seconds
           25 Seconds

Thus, there is data from ____ subjects, and each
                         20/4
observed value is a particular ________.
                               time/ subject
Section H: Distributions

We have referred to records of the observed values of
a variable as data.

It is often useful to summarize or describe data, rather
than simply listing all the observed ________ which    values
make up the data.

Suppose there were eight possible values of a variable
and you observed only three of these possible values.
Your data ________ be summarized by the    could not
          could/ could not
statement: All the possible values of the variable were
observed.

Of the following two tables, Table ____ contains data    B
                                   A/B
which might be summarized or characterized by the
statement: The value of the variable was the same for
all observations.

Table B indicates the value of the "score" variable was
____ seconds on each of the ____ days.    20, 3
You could describe or characterize the data in
Table ____ above with the statement: A different value    A
was recorded each time the "grade" variable was observed.

Suppose the grades shown in the previous tables were
based on the usual grading system of A, B, C, D, and
F, where A is the "best" grade and F is the "worst."

You could describe the data shown in Table ____ with    B
                                           A/B
the statement: An A was the best grade observed.

You could describe the data in Table(s) ________    A and B
                                        A/B/A and B
by saying the "worst" grade observed was a D.

The reason we say these statements characterize or
summarize the data is that they tell us something, but
not everything, about the data. In the previous example,
we could describe Table ____ by saying: Only two of    B
the possible values of the "grade" variable were
observed. This statement describes how many of the
possible values are represented in the data. The
statement ________ tell us which particular    does not
          does/ does not
values were observed.
9.


10.


Consider the table of data shown below.

[Table: cost of Breakfast, Lunch, and Dinner.]

The data consist of three values of a variable named
"________."    cost
"cost"/ "meal"

You could characterize (describe) the data shown in the
preceding table with the statement: The largest observed
value of the cost variable was $____.    4.10

Saying $4.10 was the largest observed value
________ tell you which particular meal cost the most.    does not
does/ does not

If you were only interested in the smallest amount paid
for any one meal, the statement "$1.50 was the smallest
amount paid for any one meal" ________ describe    would
                              would/ would not
the data in sufficient detail for your interests.

Therefore, if you were only interested in a particular
characteristic of the data, a summarizing statement
often answers your question most simply. However, a
summarizing statement ________ tell you as much    does not
                      does/ does not
about the data as a complete list or table of data.




11. 


12. 


13. 


One of the first things you might ask about a collection
of data is how often particular values of the
variable were recorded. For example, the following table
indicates a "score" of "8" was observed ____ times
out of the 5 observations making up the data.

The score value 5 occurred the same number of times
as the value 2, since they each occurred ________.

You could characterize (describe) the data in the
previous table by saying the value "____" appeared
three times and the values "____" and "____" each
appeared once.

This summary of the data ________ be sufficient
                         would/ would not
if you were only interested in which score value
occurred most frequently. The statement would tell
you that the value "____" was observed more often
than any other value.

The previous summary statement ________ tell
                               does/ does not
you enough about the data to determine on what
particular day a score of "2" was observed.

Whether or not a certain way of summarizing the data
is suitable ________ depend on what particular
            does/ does not
characteristic of the data you are interested in.
14.


15.


Suppose you asked ten people to judge whether a
particular painting was "good" or "bad." You might
obtain data of the sort shown in the following table.

[Table: the Judgment ("good" or "bad") given by each of ten people.]

You could summarize this table of data by counting the
number of times each of the possible values of the
"judgment" variable was observed.

The two possible values of the variable named
"judgment" are ________ and ________.    good, bad

According to the table of data, the value "good" was
observed ____ times and the value "bad" was observed
____ times.
Another way of saying the value "good" occurred 6 times
is to say the frequency of "good" was 6. Thus, instead
of saying the value "bad" occurred 4 times, you would
say the frequency of "bad" was ____.
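Counting how often each value occurs is mechanical enough to sketch in Python. The example below is only an illustration: the order of the ten judgments is hypothetical, since the text gives only the counts (6 "good" and 4 "bad").

```python
# Hypothetical order of the ten judgments; only the counts come from the text.
judgments = ["good", "bad", "good", "good", "bad",
             "good", "bad", "good", "good", "bad"]

# Start every possible value at a frequency of zero, then count.
frequency = {"good": 0, "bad": 0}
for value in judgments:
    frequency[value] += 1

print(frequency)
```

The frequency of a value is simply the number of times the loop meets that value in the data.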




16. 


17. 


18. 


19. 


20. 


21. 


If you were to say the frequency of a certain value 
was ten, you would simply mean that you counted how 
often that value had occurred in the data and found it 


had occurred times. 


If you said the frequency of a certain value was 30, you 
would mean that you had counted the number of times 
that value had occurred in the data and found it had 


occurred times. 


If you had a set of data in which a particular value 
occurred times, you would say the frequency of 


that value was 25. 


If 8, 6, 5, 2, 6, and 1 were a collection of data, you
would say the frequency of the value 6 was ____ and
                                           2/3
the frequency of the value 8 was ____, since "6"
occurred twice in the data, whereas "8" occurred only
once.

If a collection of data contained the values 8, 8, 2, 9,
6, 6, and 5, the frequency of the value 8 would be
____ and the frequency of the value 6 would be ____.

To say a value has a frequency of zero means that
value occurred in the data ____ times.

In a particular set of data, therefore, we might find
the frequency of the value 20 was zero. This would
mean the value 20 ________________
                  never occurred/ occurred 20 times
in the data.
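A frequency of zero can be seen directly with Python's standard `collections.Counter`, which reports 0 for any value that never occurs. This is a minimal sketch using the collection 8, 6, 5, 2, 6, 1 from the frame above; the variable names are illustrative.

```python
from collections import Counter

data = [8, 6, 5, 2, 6, 1]
frequency = Counter(data)

# A Counter returns 0 for a value that was never observed.
print(frequency[6], frequency[8], frequency[20])
```

Here the value 6 has a frequency of two, the value 8 a frequency of one, and the value 20, which never occurred, a frequency of zero.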
23.


24.


25.


26.


Suppose a variable had three values which could be
represented by the three letters A, B, and C. Suppose
your data consisted of the following observed values:
A, A, B, A, B, B, A, and A. The frequency of the
value A is ____ and the frequency of the value B is ____.

Since there were no observed values of the possible
value C, the frequency of C is ____.

If the data consisted of ten observations, we would have
a list of ten observed values. If all ten of these values
were the same, we could say the frequency of that
value was ____.

If the data are 100 observations, it ________
                                     would/ would not
be possible to have a frequency of some value which was
greater than 100.

Suppose you tossed a coin a hundred times and let it
fall on one side or the other each time. The frequency
of "heads" in your data could not possibly be greater
than ____.

The fewest number of times "heads" could occur would
be none at all. Therefore, the smallest possible
frequency of heads would be ____.
28.


Whenever we count things, we obtain a number. Since
we count the values in the data to find their frequency,
each frequency ____ a number. One way of    is
               is/is not
summarizing or characterizing data is to count how
often each of the possible values of the variable occurs.
You could determine a frequency of occurrence for
each of the possible values of the variable.

Consider the following table of data.

[Table: the grade recorded for each of 10 students.]
The variable represented in the data is named "________"    grade
and a value of this variable was recorded for 10
students. Since the grade A occurred 4 times, the
frequency of the value A is ____.    4
The frequency of B grades is ____, and the frequency
of C grades is ____ because no C's were recorded.
D and F both occurred just once. Thus the frequency
of D is the same as the frequency of F and equals ____.    1

31.


32. 


33. 


We could summarize this frequency information about
the data in the previous table as follows:

Possible Values    Frequency
Of Grades

A                  4
B                  4
C                  0
D                  1
F                  1

The data in this table are represented by the numbers
4, 4, 0, 1, 1. These numbers ________ represent
                             do/ do not
values of the "grade" variable; they do, however,
represent the ________ of times each possible
value was observed (occurred in the data).

Because grade A occurred four times, the numeral
4 occurs in the same row of the table as grade ____.

Because grade B occurred four times, the numeral
____ occurs in the same row as grade B. Since a
grade of ____ was never observed, a ____ occurs
next to that letter.

The last two rows in the frequency column contain 1's,
since both grade ____ and grade ____ were
observed only once.
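Building such a frequency table from raw data can be sketched as follows. The order of the ten grades is hypothetical (the text gives only the frequencies 4, 4, 0, 1, 1), and the list `possible_values` is supplied by hand so that a grade that was never observed still gets a row.

```python
from collections import Counter

possible_values = ["A", "B", "C", "D", "F"]

# Hypothetical raw data: one grade per student, ten students in all.
raw_grades = ["A", "B", "A", "D", "B", "A", "F", "B", "A", "B"]

counts = Counter(raw_grades)
# One row per possible value, including values with a frequency of zero.
table = [(grade, counts[grade]) for grade in possible_values]

for grade, freq in table:
    print(f"{grade}  {freq}")
```

Listing the possible values separately matters here: counting the raw data alone would never produce a row for C, because no C appears in it.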


The table in Frame 27, which contained a grade for each
student, is often referred to as a table of the
raw data. The word "raw" means that we have not
summarized the data in any way; we have merely listed
all the observations. The data are represented in the
table in Frame 29 by the frequency of each
value. This table ________ be considered a
                  would/ would not
table of raw data.


34.


35.


36.


37.


38.


If you added together the frequencies of the possible
values shown in the table in Frame 29, you would find
that the sum or total of these frequencies equals ____.

The total of all the frequencies in a frequency table
________ equal to the total number of observations in
is/ is not
the corresponding table of raw data.
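This relation can be checked with a short Python sketch. The dictionary below simply restates the grade frequencies 4, 4, 0, 1, 1 from the frequency table; the variable names are illustrative.

```python
# Frequencies of the grades A, B, C, D, and F, as given in the frequency table.
frequency_table = {"A": 4, "B": 4, "C": 0, "D": 1, "F": 1}
number_of_students = 10  # observations in the table of raw data

total = sum(frequency_table.values())
print(total, total == number_of_students)
```

Each observation is counted in exactly one frequency, which is why the total of the frequencies matches the number of observations.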


This is what we would expect, of course, since each
value in the table of raw data contributes to only one of
the frequencies in the frequency table. For example,
the four observed grades of A in the table of raw
data were only counted when the frequency for grade
____ was being determined.

If a coin were tossed a hundred times and you were
told the observed frequency of "heads" was ninety-nine,
you would know that the frequency of "tails" was ____
because the frequency of "heads" plus the frequency of
"tails" must equal ____.

If your table of raw data contained 1,000 observations
and a particular value had a frequency of 1,000, you
would know all the other possible values of the
variable had frequencies of ____.
42.


43.


Summarizing or characterizing data in the form
of a frequency table is suitable if you are only
interested in the ________ of times each value
occurred in the data. If you were interested in the
sequence or order in which each value was observed, a
table of raw data ________ be suitable for your
                  would/ would not
purposes.

Whether a frequency table or a table of raw data is
required ________ depend upon what particular
         does/ does not
aspect of the data you are interested in. Each of the
frequencies in the frequency table is a kind of summary
of your data obtained by counting how often each value
occurred. We ________ think of this frequency as a
             can/ cannot
number that describes or summarizes the data.

Any number or term that summarizes or describes a
collection of data is called a statistic. Each of the
frequencies, therefore, would be called a ________.

Frequencies are often called enumerative statistics,
because the word enumerate means to count and because
we ________ the number of times a value occurs
in the data in order to determine its frequency. Thus,
we refer to frequencies as enumerative statistics
because the word "enumerate" means to ________.

Each of the frequencies in a frequency table is a
number which summarizes the data. Each of these
frequencies is called an ________ statistic
because the word "enumerate" means to count.
44.


45.


Suppose you were conducting the following experiment.
You present a subject with one pair of tones and ask
him which of the two tones appeared louder: the first
tone or the second tone. Suppose you had presented
the subject with ten pairs of tones and asked him to make
a response after each presentation. You might have
recorded the ten responses in the following table of
________.    raw data

Tone Pair    Answer
    1          1
    2          2
    3          1
    4          1
    5          2
    6          1
    7          1
    8          2
    9          1
   10          1

In this table, an answer of "1" represents a response
indicating that the first of the two tones was the louder,
whereas the answer "2" represents a response
indicating that the second of the two tones was the
louder. The data in this table could be called, therefore,
________ data.    numerical
numerical/ non-numerical

The two possible values of the "answer" variable in
the previous table are "____" and "____." The number    1, 2
of observations represented in this table is ____.    10
46. The previous table of raw data could be summarized
in the following ________ table.    frequency

Answer    Frequency
1         7
2         3

This frequency table contains the two enumerative
statistics "____" and "____."

47. Since the total of the frequencies in the frequency table
must equal the total number of observations in the table
of raw data, we did not have to count the number of
times answer 2 occurred if we knew how many times
answer 1 had occurred. For example, if the data in
the table of raw data had been different and the
frequency of the answer 1 had been 6, we would have
known immediately that the frequency of the answer 2
was ____, since 10 minus 6 equals 4.    4
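The shortcut of subtracting instead of counting can be sketched in Python with the ten tone-pair answers recorded in the table of raw data. The variable names are illustrative only.

```python
# The ten answers from the tone-pair table of raw data.
answers = [1, 2, 1, 1, 2, 1, 1, 2, 1, 1]

n = len(answers)                   # total number of observations
freq_answer_1 = answers.count(1)   # counted directly
freq_answer_2 = n - freq_answer_1  # obtained by subtraction, not by counting

print(freq_answer_1, freq_answer_2)
```

The subtraction works only because the variable has exactly two possible values, so every observation that is not a 1 must be a 2.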

48. We have referred to a table containing a list of each
observed value as a table of raw data. A table listing
the frequency of occurrence of each value is called
a ________ table.    frequency
49. Each frequency in a frequency table is the number of
times a particular ________ has been recorded in    value
the data.
50. Since the frequency of each value is determined by
counting, each frequency is a ________.    number
55.


A statistic is a term or number which describes some
characteristic of a collection of data. Thus, each
frequency in a frequency table is a ________.    statistic

We refer to the frequency of a particular value as an
enumerative statistic because the word "enumerate"
means to "________."    count

Because it lists the value of each observation, the
following table could be called a table of ________    raw
data. In other words, the table presents a complete
list of the data.

[Table: Observations and their Values.]

When the observed data contains only two different
values, we ________ be sure there are only two    cannot
           can/ cannot
possible values of the variable.

The following table could be used as a ________    frequency
table for the previous table of raw data by filling in the
frequencies for each value in the appropriate row of
column ____.

[Table: a Value column with blank columns for the frequencies.]




55. (Continued)

The frequency of the value A in the table in Frame 53 is
______. The frequency of the value B in the same        4
table is ______. Notice how the total of the frequencies
in the frequency table ______ (is/is not) the total number of   is
observations in the table of raw data.

56. Suppose the variable represented in the table of raw
data in Frame 53 had four possible values: A, B, C,
and D. Since there were ______ observations of the      no
values C and D, the frequencies of C and D would both
be ______.                                              0

57. A table of raw data and two frequency tables labeled
Table A and Table B are shown below. The frequency
table that corresponds to the table of raw data is
Table ______.

[Table of raw data, Table A, and Table B: entries not recoverable]
63.


Notice that the value ______ does not appear in the
preceding table of raw data and that its frequency
______ (is/is not) presented in Table B.

Table A could not represent the data in the table of raw
data because the value 10 is represented as having a
frequency of 3, whereas it should have a frequency of
______, as it does in Table B. Any value of a variable
not observed (recorded) in the table of raw data has a
frequency of ______.

If there were 8 observations in the table of raw data,
we would know that the frequency of the value 15 was ______
so long as we knew that the frequency of the value 5
was four and that the frequency of value 10 was four.

Every possible value of a variable has some frequency,
whether or not it is recorded in the data, since a
value has a frequency of ______ if it is not recorded
in the data.

In summary, we can say a table of raw data lists the
______ of each observation, whereas a frequency
table lists the ______ of times each value
occurred in the table of raw data.

Each of the entries in a frequency table is a number
and each of these numerals is called an ______
statistic.

The total (sum) of these enumerative statistics equals
the total number of observations in the table of raw
data.
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frequency (number) 
65.


66.


A collection of frequencies in a frequency table is
called a frequency distribution of that variable. A
______ distribution indicates, therefore,               frequency
how often each value of a variable occurred in the data.

In other words, the collection of enumerative statistics
in a frequency table indicating how often each value of
a variable occurred is called a f______                 frequency
d______.                                                distribution

Suppose you asked a subject to sort a collection of ten
drawings into four boxes, each box labeled with one of
the four adjectives excellent, good, fair, and poor.
The subject would be distributing the ten drawings
among the four possible judgments he could make
concerning the merit of each drawing. If you recorded
his performance, your table of raw data would contain
______ (10/4) observations of a variable called "judgment."   10
This variable has ______ possible values.               4

When the subject was finished distributing the ten
drawings among the four boxes, each box would have
some particular number of drawings in it. The number
of drawings in each box could be thought of as the
______ of each value of the "judgment"                  frequency
variable. If the subject put five drawings in the box
labeled "good" and five drawings in the box labeled
"fair," the frequency of "good" judgments would be the
same as the frequency of "fair" judgments, and would
equal ______.
(Continued)


The frequency of "excellent" judgments and of "poor"
judgments would both equal ______.

The numerals 5, 5, 0, and 0 ______ (could/could not) be called
a collection of enumerative statistics describing the
subject's judgments.

It ______ (would/would not) be appropriate to say the numerals
5, 5, 0, and 0 in the frequency table define a frequency
distribution.
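
The drawing-sorting illustration can be sketched in a few lines of Python. This is only an illustrative sketch: the names `POSSIBLE_VALUES`, `raw_data`, and `frequency_distribution` are assumptions introduced here, and the ten judgments are the hypothetical five "good" and five "fair" described above.

```python
from collections import Counter

# The four possible values of the "judgment" variable.
POSSIBLE_VALUES = ["excellent", "good", "fair", "poor"]

# Hypothetical table of raw data: one judgment per drawing (ten observations).
raw_data = ["good"] * 5 + ["fair"] * 5


def frequency_distribution(observations, possible_values):
    """Return the frequency of every possible value, including zeros."""
    counts = Counter(observations)
    # A Counter reports 0 for any value that was never observed.
    return {value: counts[value] for value in possible_values}


distribution = frequency_distribution(raw_data, POSSIBLE_VALUES)
for value, frequency in distribution.items():
    print(value, frequency)
```

Listing the possible values explicitly is what lets the unobserved values "excellent" and "poor" appear with a frequency of zero rather than being left out.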


It is often useful to present enumerative statistics by
means of a graph. We could represent the frequency
of each value with the height of a column, just as we
represented the ______ of a particular
observation by the height of a column in earlier graphs.

Of the two types of graphs shown below, Graph ______ (A/B)
would be appropriate for showing the frequency
distribution of a variable named "score."

If there were only six possible values of the "score"
variable, you would know that 6 columns would be
required in the frequency distribution graph in order
to represent all six values. However, since there
were only four observations, the largest possible
frequency for any score would be ______ (4/6).

The following frequency distribution graph contains
two columns. Score A has a frequency of ______ and
Score B has a frequency of ______ (as indicated by the
heights of columns A and B respectively).


[Graph: Frequency (vertical axis, 0 to 20) by Score (columns A and B)]


The table of raw data for the previous frequency 
distribution graph would have contained a total of 


observations. 


20/30 


The number of observations represented by the
following frequency distribution graph is ______, since
A occurred ______ times, B occurred ______ times,
and C occurred ______ times.

[Graph: Frequency (vertical axis) by grade; columns not recoverable]

You know that the most frequently occurring grade in
the previous frequency distribution is C, because this
grade has the ______ column.


You can think of each column in the frequency 
distribution graph as composed of a series of blocks — 
one block for each observation of that particular value. 
For example, consider the graph shown below. 


[Graph: Frequency (vertical axis) by Value (columns A and B), each column drawn as a stack of blocks]

This graph indicates a frequency of ______ for Grade A
and a frequency of ______ for Grade B. Therefore, the
column for A is four blocks high and the column for B
is ______ blocks high.

Notice the number of blocks in column A forms a
column twice as high as column B, indicating that the
frequency of A is ______ as great as the
frequency of B.


77. Enumerative or frequency statistics are very important
in elections. You are undoubtedly familiar with
interpreting the data from elections even if you had not
thought of these data as statistics. To illustrate this
point, suppose that you were conducting an election in
which there were three candidates: R.K., A.C., and
R.A. We could think of each person's vote as an
observation having one of three possible values. These
values would be a vote for ______, ______, or ______.

We could summarize the data from this election in a
table such as the one shown below.

CANDIDATE    VOTES
R.K.           25
A.C.           75
R.A.           50

Considering "vote" as a variable, this table would be a
frequency table indicating that the value "R.K." had a
frequency of ______, the value "A.C." had a frequency
of ______, and the value "R.A." had a frequency of ______.

The number of votes cast for candidate R.K. is
represented by the number 25 and the number of votes
cast for candidate A.C. by the number 75. Both of
these numbers are ______ statistics.
The group of three numbers, 25, 75, and 50, are the
distribution of the 150 votes among the three
candidates. In other words, these numbers define a
frequency ______.


R.K., A.C.,
R.A.


25,
75, 50


enumerative


distribution




78. 


79. 


80. 


81. 
This frequency distribution could also be represented
by a graph. The graph shown below ______ (would/would not)
represent the frequency distribution shown in the
previous frequency table.

[Graph: Frequency (vertical axis) by candidate (columns R.K., A.C., R.A.)]

Notice the total number of voters in the election could
be determined simply by adding all the ______
in the frequency table.

It is clear, therefore, that ______ people voted in the
election.

The winner of the election is indicated by the candidate
who has the ______ column in the frequency
distribution graph. This was the candidate named ______,
who received ______ votes.
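
As a rough sketch of the election illustration, the frequencies 25, 75, and 50 quoted in the frames can be totaled and compared in Python. The variable names `votes`, `total_voters`, and `winner` are assumptions made for this example only.

```python
# Frequency table for the "vote" variable, using the counts quoted in the text.
votes = {"R.K.": 25, "A.C.": 75, "R.A.": 50}

# Adding all the frequencies gives the total number of observations (voters).
total_voters = sum(votes.values())

# The winner is the value with the highest frequency (the tallest column).
winner = max(votes, key=votes.get)

print(total_voters)
print(winner, votes[winner])
```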


Suppose you were a psychologist interested in studying 


how accurately a person could throw darts at the dart 


board shown below: 




would 


frequencies 
(enumerative 
statistics) 


150 


highest 
A. C. 
75 




82. 


83. 


(Continued) 


You might conduct the following experiment. The 
subject would be asked to stand ten feet away from the 
target and attempt to hit the innermost ring. Suppose 
you told him that he would be scored as follows: 3 

for the inner ring, 2 for the middle ring, 1 for the 
outer ring, and 0 for a complete miss. You would 
then ask him to toss the dart at the board twenty times, 
recording his score for each toss. 


In this experiment, the distance of the subject from the
board would be a ______ (constant/variable), whereas the score   constant
the subject received for each toss would be a
______ (constant/variable).                                      variable

Suppose you obtained the data shown in the following
table:

TOSS   SCORE      TOSS   SCORE
  1      0         11      1
  2      1         12      2
  3      1         13      2
  4      2         14      2
  5      1         15      3
  6      0         16      1
  7      2         17      3
  8      1         18      2
  9      3         19      3
 10      2         20      3

This table indicates that the subject ______ (did/did not) miss   did
the target completely on his first toss. The first bulls-eye
he made was on the ______ toss.                                  9th
(Continued)


The subject completely missed the target ______
times out of the twenty tosses.

To form a frequency distribution, we must count the
number of times each ______ of the score
variable occurred in the data.

We already noted that the value 0 occurred only twice
in the data. Therefore, the frequency of a score of 0
would be ______.

The frequency of a score of 1 is ______. The
frequency of 2 is ______ and the frequency of 3 is ______.
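
The counting described in these frames can be sketched in Python. The list `scores` below is an assumption: it reads the dart table as tosses 1 through 10 in the left-hand pair of columns and tosses 11 through 20 in the right-hand pair.

```python
from collections import Counter

# Scores for tosses 1-20, read from the table of raw data
# (left-hand columns first, then right-hand columns).
scores = [0, 1, 1, 2, 1, 0, 2, 1, 3, 2,
          1, 2, 2, 2, 3, 1, 3, 2, 3, 3]

# Count how many times each score value occurred.
frequency = Counter(scores)

# Report every possible value of the "score" variable (0 through 3).
for value in range(4):
    print(value, frequency[value])

# Toss number (counting from 1) of the first bulls-eye, i.e. the first score of 3.
first_bullseye = scores.index(3) + 1
print(first_bullseye)
```

The four frequencies add up to the twenty tosses, which is a quick check that no observation was missed while counting.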


This group of ______ statistics defines
a frequency ______ of the variable called
"score."

Suppose the data for twenty tosses indicated the
frequency of 1's was 5 and the frequency of 2's was 15.
The frequencies of both 0's and 3's must therefore be
______.

A graph of the ______ (raw data/frequency distribution) is shown
below:
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92. 


Graphing the raw data in this way reveals an
interesting characteristic of the subject's performance.
In general, the more tosses he took, the ______ (better/worse)
was his performance.

His performance appears to have improved as he
continued his tosses because the columns tend to be
______ (higher/lower) towards the right-hand side of the graph.

A table and a ______ are shown below. The
table contains the frequency distribution we just
considered. This same frequency distribution
______ (is/is not) represented by the graph.

SCORE FREQUENCY

The most frequently occurring score is immediately
obvious, since the highest column in the frequency
distribution graph is for a score of ______.

While the frequency distribution graph clearly indicates
the frequency of each score value, it ______ (does/does not)
indicate the gradual improvement in performance during
the course of the experiment.

A graph of ______ (raw data/frequency distribution) would be the
best way to indicate how the subject's performance
improved during the course of the experiment, whereas
a graph of ______ (raw data/frequency distribution) is the
simplest way of showing how the subject's tosses were
distributed among the different scores.


In the previous experiment, one toss of the dart might 
actually be closer to the center of the target than 
another and yet receive the same score. For example, 


consider the target shown below: 


If the X marks labeled "A" and "B" represent two
places where a dart could have hit the target, the point
labeled ______ (A/B) would represent a more accurate toss
of the dart.

(Continued)


If all darts falling within the center circle of the
target receive a score of 3, the dart hitting at Point A
would receive a score of ______ and the dart hitting
at Point B would also receive a score of ______.

Therefore, although one dart was actually thrown more
accurately than another, it received the same score.
In evaluating the subject's performance, it is easier to
group all tosses of the dart that hit within the center
circle into the same category of accuracy. In a similar
manner, we do not consider exactly how far away from
the target a dart was if it missed the target completely,
since all such darts received a score of ______.


Another way we might assign a score to each toss of 
a dart would be to measure exactly how far from 
the center of the target each dart struck. If we 


followed this procedure, there be four 


would/ would not 


possible values of the "score" variable. 


The number of possible values of the "score" variable 
would depend upon how precisely we wanted to measure 
the distance between the dart and the actual center of 


the target. 


If we measured this distance to the closest one-thousandth
of an inch, there would be ______ (more/fewer) possible values
of the score variable than if we measured down to the
closest one-hundredth of an inch. The more precisely
we measured the distance, the more possible values
of the score variable there would be.

101.


The more precisely we measured the exact distance
between the center of the target and the place where
the dart struck, the ______ (more/less) often would we find two
tosses of exactly the same score. If we measured
closely enough, in fact, we would probably find that the
subject never received exactly the same score on any
two tosses.


It is often useful to consider two observations that are 
really slightly different as having the same value. In 
other words, to group together observations that are 
sufficiently similar and to consider them as having the 
same value is often a useful procedure. The following 
illustration will indicate how we can group observations 


in this way. 


Suppose you were studying how long it took a person to 
solve a certain mathematical problem. If you made
observations on twenty people, recording the time it 
took each person to solve the problem, your data might 
be represented by the following table: 


Time in Time in 
Subject Seconds Subject Seconds 
1 60 11 20 
(Continued)


Notice there ______ (is/is not) a value of the time variable
which occurs more than once in the table of data. (You
might wish to insert a bookmark indicating this page,
since we will refer to the table shown above in the next
several frames.)

Since every observed value of the data occurs once and
only once, each of these observed values would have a
frequency of ______.

Notice that the highest value (the longest time)
recorded was ______ seconds, whereas the lowest
value recorded was ______ seconds.


Suppose we counted all the "times" of 50 seconds or 
faster. We would find that exactly five of the observed 
times fell into this group of times. This group of times 
would consist of the 32 seconds observed for Subject 7, 
the 16 seconds observed for Subject 9, the 20 seconds 
observed for Subject 11, the 15 seconds observed for 
Subject 18, and the seconds observed for 
Subject 5 


We could say the frequency of "times" between zero and
50 seconds was five, since there are exactly five
observed times that were less than or equal to
______ seconds.

107.


Suppose we counted the "times" between 51 seconds
and 100 seconds (including times of 51 seconds or
100 seconds). The time recorded for Subject 1
______ (would/would not) be counted, since this time is between
51 seconds and 100 seconds.

There are exactly ______ observed times that fall
between 51 and 100 seconds in the previous table of
data. The next group of times that we will consider are
those times between 101 and 150 seconds, including any
that might be 101 or 150 seconds. Subject 2's time of
128 seconds ______ (does/does not) belong to the group of times
between 101 and 150 seconds. The other times which
fall into this group are the time of 149 seconds recorded
for Subject 8, the time of 110 seconds recorded for
Subject 10, the time of ______ seconds recorded for
Subject ______, and the time of 116 seconds recorded for
Subject 16.


We could summarize these frequencies in the following 


frequency table of grouped data. 


TIME            FREQUENCY
0-50 sec.           5
51-100 sec.        10
101-150 sec.        5


Notice the three rows correspond to the three different 


groups of times. 
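
Grouping can be sketched in Python as counting how many observed times fall within each range. This is a minimal sketch: `GROUPS`, `times`, and `grouped_frequencies` are names assumed for the example, and the ten times below are hypothetical (some echo values quoted in the frames, but they are not the full table of twenty subjects).

```python
# Inclusive lower and upper limits, in seconds, of the three groups of times.
GROUPS = [(0, 50), (51, 100), (101, 150)]

# Hypothetical solution times in seconds; not the full table from the text.
times = [60, 128, 32, 149, 16, 110, 20, 116, 15, 75]


def grouped_frequencies(observations, groups):
    """Count the observations falling within each inclusive (low, high) group."""
    return {
        f"{low}-{high} sec.": sum(1 for t in observations if low <= t <= high)
        for low, high in groups
    }


table = grouped_frequencies(times, GROUPS)
for group, frequency in table.items():
    print(group, frequency)
```

Because the groups do not overlap and together cover every possible time, each observation is counted exactly once, so the grouped frequencies still add up to the number of observations.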
107. (Continued)


For instance, in the first row "0-50 seconds" indicates
that we counted all the times between 0 and 50 seconds.
The numeral 5 in the same row but in the frequency
column indicates that the ______ of observed            frequency (number)
times falling within this group was ______.             five

108. A frequency table of this sort is called a frequency table
of grouped data because we have determined the
frequency for ______ of values, rather than             groups
the frequency for particular values. Below is a graph
of the previous frequency table of grouped data:

[Graph: Frequency (vertical axis, 5 and 10 marked) by Time in Seconds (0-50, 51-100, 101-150)]

This frequency distribution gives a reasonable picture
of the distribution of times for the different subjects.
It indicates clearly that some subjects had times less
than 50 seconds, that about the same number had times
greater than 100 seconds, and that the most typical
times fell between ______ and ______ seconds.           51, 100
109.


If you are interested in obtaining a rough picture of the
frequency distribution, it is often useful to group the
data in this way, that is, not distinguishing between
values that fall within the same group.

Just how much you can group your data and still obtain
a sufficiently clear picture of the distribution depends
on your particular purposes. For example, in the
previous illustration in which subjects were throwing
a dart at a target, it was sufficient to divide the board
into three circles and assign one of three scores
depending upon where each dart struck. Consider the
two targets shown below. Target A is divided as the
target was in the previous illustration. In Target B,
however, more circles have been drawn, thereby
dividing the target into narrower rings.


TARGET A TARGET B


If you assigned a score to each dart depending upon
which ring it struck, there would be more possible
scores on Target ______. The only difference between
Target A and Target B, however, is that you could
distinguish the position of a dart more precisely on
Target ______.
110. In a similar way, we might have divided the previous
mathematical problem data into smaller groups. For
example, we could have divided each of the groups of
times into two smaller groups. Instead of considering
all the times from 0 to 50 seconds, we would have
considered all the times from 0 to 25 and from 26 to 50
seconds. Instead of considering all the times from 51 to 100
seconds as a single group, we would have formed two
groups, 51 to 75 and 76 to 100. Finally, we could have
divided the group of scores from ______ to ______       101, 150
seconds into the following two groups: 101 to 125
seconds and 126 to 150 seconds.

111. If you become a scientist, one of your chief jobs will be
to make observations of variables. Records of these
observations are called ______.                         data

112. We have seen that one way of presenting data is to
arrange it in rows and columns to form a ______.        table

113. A table listing every observed value of the variable is
often referred to as a table of ______ data.            raw

114. The number of times a particular value occurs in a
table of raw data is referred to as the ______          frequency
of that value.

115. You ______ (can/cannot) determine the frequency of each value   can
for both a numerical and a non-numerical variable.




116. Instead of presenting a complete list of all of the
observed values in your data, it is often useful to
summarize the data in some way. Any term or number
summarizing data in this way is called a s______.

117. One way we can summarize data is to report the
frequency of occurrence for each value in the table of
raw data. Therefore, each of these frequencies would
be a s______.

118. Since the frequency of a value is determined by counting
(enumerating) the number of times it was observed, a
frequency is often referred to as an ______
statistic.

119. Any value of the variable not observed (one that does
not occur in the table of raw data) would have a
frequency of ______. If the data consisted of
20 observations, the greatest possible frequency of any
value would be ______.

120. We can summarize the contents of a table of raw data
in a table listing the number of times each value
occurred in the data. This summarizing table is often
referred to as a ______ table.

121. The enumerative statistics presented in a frequency
table indicate how our observations were distributed
among the different possible ______ of the
variable we were studying.
122. The frequency distribution of a collection of data
is the collection of frequencies for the values in that
data. Therefore, the enumerative statistics in a
frequency table indicate the ______ ______
of the data.

123. If you were told that a coin had been tossed 100 times
and had come up "heads" 75 times and "tails" 25 times,
you ______ (would/would not) know the frequency distribution
of the data.

124. Suppose you tossed a coin 100 times and observed 50
"heads" and 50 "tails." Instead of reporting the actual
frequency of "heads" or "tails," you might describe the
distribution of the variable by saying that one-half of the
tosses were "heads" and that ______ of the
tosses were "tails."

125. If you observed 500 "heads" and 500 "tails" in 1000
tosses of a coin, you could ______ (also/not) say that one half
of your observations were heads and one half of your
observations were tails.

126. When you say one half of the observations are "heads,"
you are not indicating the actual frequency of "heads" in
the data. Instead, you are indicating what part of all
the observations is "heads." For example, if I told
you I tossed a coin some unknown number of times and that
one half of the observations were "heads," you
______ (would/would not) know the actual frequency of heads;
you ______ (would/would not) know, however, what part of the
data was made up of observations of "heads."
127. If you knew the frequency of a particular value was 10,
you would not know whether observations of that value
represented a large part of the data or a small part
of the data. If a value had a frequency of 10,
observations of that value would represent a larger
part of your data if the data consisted of ______ (12/100)
observations rather than ______ (12/100) observations.

128. Instead of reporting the actual frequencies of each value
in your data, it often is useful to indicate the relative
frequency of each value. The relative frequency of a
value indicates how often that value occurred in relation
to the total number of observations. For example, if
you said the value 10 occurred 30 times in the data, you
would be indicating the actual or absolute frequency of
the value 10. If you said one-third of the total number
of observations had the value 10, you would be
indicating the ______ (absolute/relative) frequency of the value 10.

129. Adding together all of the frequencies in a frequency
table indicates the total number of observations in the
data. To find out what proportion or part of the total
is represented by a particular frequency, you simply
divide that frequency by the total number of
observations. If your data consisted of 100
observations and a particular value had a frequency of
50, the proportion of your observations having that
value would equal 50 divided by ______, which
equals one-half.
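
The division described here can be sketched in Python with exact fractions. This is an illustrative sketch only: the function name `relative_frequency` is an assumption, and the total of 90 observations in the second example is a hypothetical figure chosen so that a frequency of 30 comes to one-third.

```python
from fractions import Fraction


def relative_frequency(frequency, total_observations):
    """Proportion of all observations represented by one value's frequency."""
    return Fraction(frequency, total_observations)


# A frequency of 50 out of 100 observations is one-half of the data.
half = relative_frequency(50, 100)

# Hypothetical total: a frequency of 30 out of 90 observations is one-third.
third = relative_frequency(30, 90)

# The same absolute frequency (10) is a larger part of 12 observations
# than of 100 observations.
part_of_12 = relative_frequency(10, 12)
part_of_100 = relative_frequency(10, 100)

print(half, third, part_of_12, part_of_100)
```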
130. If your data consisted of only 4 observations and a
particular value had a frequency of 1, then the
proportion of times that particular value occurred is
______ divided by ______, or one-fourth.                1, 4

131. If your data consisted of 1000 observations and you
observed a particular value 250 times, the frequency
of that value would represent one-fourth of the total
number of observations, since 250 divided by 1000
equals ______.                                          one-fourth

132. It is possible to represent a proportion in several
different ways. For example, you could represent the
proportion "one-half" either as a fraction, which would
be written as 1/2, or as a decimal, which would be
written as .5. Similarly, you could represent the
proportion "one-tenth" as either a fraction, which
would be written ______, or as a decimal, which
would be written ______.

If your data consisted of 100 observations of a variable
and 10 of those observations had the value 4, you could
say the proportion of observations in your data having
the value of 4 was ______ (representing the
proportion as a fraction) or ______ (representing
the proportion as a decimal).


133. It is possible to convert a particular fraction into 
another form representing the same value. For 
example, 2 is the same as i. Similarly, you can 


write a decimal in several ways. For example, .2 


is the same as . .20 
.20/.02 
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133. 


134. 


135. 


(Continued) 


Statisticians often find it useful to represent a 


proportion in terms of hundredths. In these terms, 


one-fourth would equal Ki , whereas one-half would 


50 Mec 
equal 100* Similarly? would equal 


hundredths. 


If a certain proportion equals 25/100, we often say that
it equals 25 percent. Similarly, if a proportion equals
50/100, we can say it equals fifty percent. Representing
a proportion as a percentage is simply a convenient way
of comparing proportions by converting them all into
hundredths. For example, you could compare one-half
and one-fourth by saying one-half equals ______ (50/25)
percent, whereas one-fourth equals ______ percent       25
(since 1/2 = 50/100 and 1/4 = 25/100). Similarly,
the proportion .25 represents 25/100 or ______
percent.
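
Converting a proportion to a percentage amounts to expressing it in hundredths, which can be sketched in Python as multiplying by 100. The function name `to_percent` is an assumption introduced for this example.

```python
from fractions import Fraction


def to_percent(proportion):
    """Express a proportion as a number of hundredths (a percentage)."""
    return float(proportion * 100)


print(to_percent(Fraction(1, 2)))   # one-half as a percentage
print(to_percent(Fraction(1, 4)))   # one-fourth as a percentage
print(to_percent(0.25))             # the decimal proportion .25 as a percentage
```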


If you said the frequency of a particular value in a
collection of data represented 50 percent of the data,
you would know the frequency of that value represented
______ hundredths of the data. Thus, if the data
consisted of 100 observations, a value would have to
have a frequency of ______ in order to represent
50 percent of the observations, since 50 percent is
equivalent to 50/100. A value observed on five out of
ten observations also represents 50 percent of those
10 observations since 5/10 equals ______
hundredths.
136.


137.


138.


If your data consisted of one thousand observations, of
which 250 had the value 5, you could say that ______
percent of your observations had the value 5. If 250
out of one thousand observations had the value 5, then
250/1000, or 25/100, of the observations would have
the value 5. This is why you could say that 25
p______ of the observations had the value 5.

Instead of the word "percent," the symbol % is often
written after a number. Thus, 25% means the same
thing as 25 ______.

A proportion written as a percent is often referred to
as a percentage. Therefore, the following is a list
of ______s.

20%, 30%, 5%, 8%, 2%

Remember, ______ (20%/25) represents a proportion of 20/100.

While it is often useful to represent a frequency as a
proportion or percentage, it is important to
consider the actual frequency. For example, suppose
an automobile dealer told you 3/4 of the people to
whom he had sold a particular car claimed it was the
best car they had ever driven. If he had sold one
thousand of these cars, the proportion 3/4 would
represent ______ (750/250) people. On the other hand, if
he had only sold four such cars, only ______ people
would have told him it was the best car they had ever
driven.
Remember, 50% of 1000 is ______, whereas
50% of 100 is ______.
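
The automobile dealer illustration runs the earlier division in reverse: multiplying a proportion by the total number of observations recovers the actual frequency. A minimal Python sketch, in which the function name `actual_frequency` is an assumption:

```python
from fractions import Fraction


def actual_frequency(proportion, total_observations):
    """Recover the actual (absolute) frequency behind a proportion."""
    return proportion * total_observations


three_fourths = Fraction(3, 4)

# The same proportion stands for very different numbers of people.
print(actual_frequency(three_fourths, 1000))  # cars sold: one thousand
print(actual_frequency(three_fourths, 4))     # cars sold: four

# Fifty percent is the proportion 50/100.
fifty_percent = Fraction(50, 100)
print(actual_frequency(fifty_percent, 1000))
print(actual_frequency(fifty_percent, 100))
```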


100 


25 


percent 


percent 


percentages 


20% 


750 


500 
50 
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139. 


140. 


141. 


142. 


It is easy to convert any frequency table into a table 
listing the proportion of times each value was observed. 
To do so, you simply divide each frequency in the 
frequency table by the total number of observations in 
your data. By doing so, you convert each ______ 
                                         absolute/relative 
frequency to a(n) ______ 
                  absolute/relative 
frequency or proportion. 
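
The procedure in this frame (divide each frequency by the total number of observations) might be sketched as follows. The raw data are made up to match the red/green/blue counts used in the frames that follow, and the function name is an assumption.

```python
from collections import Counter
from fractions import Fraction

def relative_frequencies(observations):
    """Divide each absolute frequency by the total number of observations."""
    counts = Counter(observations)   # absolute frequency of each value
    total = sum(counts.values())     # total number of observations
    return {value: Fraction(freq, total) for value, freq in counts.items()}

# Hypothetical raw data: 2 "red", 3 "green", 5 "blue" (10 observations).
data = ["red"] * 2 + ["green"] * 3 + ["blue"] * 5
proportions = relative_frequencies(data)
```

`Fraction` keeps each proportion exact, so 5/10 is stored in lowest terms as 1/2, just as the text reduces it.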


Consider the frequency table shown below: 


VALUE      FREQUENCY      PROPORTION 
red            2 
green          3 
blue           5 


The data represented in this table consist of ______ 
observations of a ______ variable. 
                  numerical/non-numerical 


In the third column of the above table, we could list 
the proportion of observations having each of the three 
values. For example, the value "red" occurred ______ 
times in the ______ observations. Therefore, the 
proportion of times "red" occurred equals ______ divided 
by ______. 

The proportion of times that the value "green" was 
observed equals ______ divided by ______, which 
equals ______. 
absolute, relative 


10 
non-numerical 
143. The ______ of times the value "blue"                    proportion 
occurred in the data equals 5 divided by 10, or 1/2. 


144. 


VALUE      FREQUENCY      PROPORTION 

red            2             2/10 
green          3             3/10 
blue           5             5/10 


The numbers in the third column of this table are the 
______ we just calculated.                                   proportions 


145. Each of these proportions tells what part of the total 
number of observations is represented by each value. 
If a particular value had a frequency of 0, you would 
know that none of the observations had that value; 
therefore, the proportion of observations having that 
value would equal 0 divided by the total number of 
observations, which means the proportion would equal ______. 


146. Suppose you asked 100 people to predict which party, 
Democratic or Republican, would win the next 
Presidential election. Your data might look like that 


summarized in the frequency table shown below: 


                 FREQUENCY 

DEMOCRATIC          75 
REPUBLICAN          25 
146. (Continued) 


You could describe your data as consisting of 100 
observations of a variable you might call "predicted 
party," with "Democratic" and "Republican" the two 


possible ______ of that variable.                            values 
147. The proportion of people predicting Democratic 
was ______,                                                  3/4 
whereas the proportion of people predicting Republican 
was ______.                                                  1/4 
148. The number 75 indicates the ______ frequency            absolute 
                                 absolute/relative 
of the value "Democratic" in the previous data, whereas 
the number 3/4 indicates this value's ______                 relative 
                                      relative/absolute 
frequency. 


149. Just as the frequencies 75 and 25 describe the absolute 
frequency distribution of the previous data, the 
proportions 3/4 and 1/4 describe the relative frequency 
distribution of the previous data. 


If you tossed a coin 20 times and observed 10 "heads" 
and 10 "tails," the numbers ______ and ______                1/2, 1/2 
indicate the relative (proportional) frequency 


distribution of your observations. 
150. 


151. 


152. 


Suppose you asked 1,000 people, rather than 100 people, 
to predict who would win the election. You might have 
obtained the data represented in the following frequency 
table: 

                 FREQUENCY 

DEMOCRATIC          750 
REPUBLICAN          250 


In the example where we asked 100 people, 3/4 of them 
predicted "Democratic" and 1/4 predicted "Republican." 
In the present example of 1,000 predictions, the 
proportion of people predicting "Democratic" is ______, 
whereas the proportion of people predicting "Republican" 
is ______. 


The important point to be made here is that even though 
the ______ frequency distributions were 
    absolute/relative 
different for the two examples, the ______ 
                                    absolute/relative 
frequency distributions were the same. 
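
A small sketch can make the comparison concrete. The frequencies 75 and 25 come from the first poll; the frequencies 750 and 250 for the second poll are an assumption here, chosen because they are the only counts that give 3/4 and 1/4 of 1,000 predictions.

```python
from fractions import Fraction

def relative_distribution(frequency_table):
    """Convert a table of absolute frequencies into proportions."""
    total = sum(frequency_table.values())
    return {value: Fraction(freq, total) for value, freq in frequency_table.items()}

poll_of_100 = {"Democratic": 75, "Republican": 25}
poll_of_1000 = {"Democratic": 750, "Republican": 250}  # assumed counts

same_absolute = poll_of_100 == poll_of_1000
same_relative = relative_distribution(poll_of_100) == relative_distribution(poll_of_1000)
```

The absolute tables differ, while the converted tables are equal value for value.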


Let us consider another case in which we change an 
absolute frequency distribution to a relative frequency 


distribution by converting each frequency to a 
proportion. Suppose you have the data represented in 


the following frequency table: 


VALUE        FREQUENCY 
0 - 5            3 
6 - 10           5 
11 - 15          2 
3/4 


1/4 


absolute 


relative 
152. 
158. 


(Continued) 


Note that this data consists of ______ observations of 
                                10/12 
a ______ variable. 
  numerical/non-numerical 
Furthermore, the frequencies in the table are for 
______ data. 
grouped/ungrouped 


The number 3 represents the number of times a value 
between ______ and ______ occurred in the data. 

The number of times a value between 11 and 15 was 
observed is ______. 

The group of values between ______ and ______ 
occurred more often than any other group of values 
in the data. 


The proportion of times a value was observed between 
6 and 10 is equal to ______ divided by ______, which 
equals 1/2. 

The proportion of times a value between 0 and 5 was 
observed is ______, whereas the proportion of 
times values between 11 and 15 were observed is ______. 

Therefore, the ______ frequency distribution 
               absolute/relative 
of the previous data is represented by the numbers 3, 5, and 2, 
whereas the ______ frequency distribution is 
            absolute/relative 
represented by the numbers 3/10, 5/10, and 2/10. 
10 


numerical 


grouped 


3/10 
2/10 (1/5) 


absolute 


relative 
159. 


160. 


161. 


162. 


TABLE A                          TABLE B 

VALUE        FREQUENCY           VALUE        FREQUENCY 
0 - 5            5               0 - 5           300 
6 - 10           3               6 - 10          500 
11 - 15          2               11 - 15         200 


Of the two tables shown above, the table having the same 
relative (proportional) frequency distribution as the 
data shown in Frame 152 is Table ______. 
                                 A/B 

Notice that Table B consists of data from ______ 
observations. Thus, each frequency would be 
converted to a proportion by dividing it by ______. 


Expressing the frequency of a value as a proportion 


indicates what part of the total number of observations 


were of that particular value. The largest possible 
proportion you could obtain would occur when all the 


observations in your data had the same value. This 


proportion would be ______. 

For example, if the frequency of a particular value 
were 1,000 in a collection of 1,000 observations, the 
relative frequency (proportion) of observations having 
that value equals ______ divided by ______, or 1. 


If a particular value had a frequency of 0, it would be 
represented in a relative frequency distribution by a 


proportion of ______. 
one 


1,000 


1,000 
163. Notice that all of the proportions in a relative frequency 
distribution must lie somewhere between the largest 
possible proportion of ______ and the smallest possible 
proportion of ______.                                        0 
164. Each of the proportions in a relative frequency 


distribution represents the part of the total collection of 
observations having a particular value. All of the parts 

of something added together must equal the whole. 

Therefore, the sum of all the proportions in a 

proportional distribution must always equal ______.          1 


165. A little thought will indicate why the total of all of the 
proportions in a proportional distribution must equal 1. 
Each proportion is a fraction whose numerator is the 
frequency of some value and whose denominator is the 
total number of observations represented in the data. 
The sum of all these fractions would simply equal the 
sum of all the numerators over this common 
denominator. However, the sum of all the numerators 
is the sum of the frequencies in the frequency table, 


which ______ equal the total number of                       does 
      does/does not 
observations. Therefore, the sum of all of the fractions 
or proportions would always equal ______.                    1 

For example, 2/10 + 3/10 + 5/10 equals (2 + 3 + 5)/10, which 
equals 10/10, or ______. 
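
The argument in this frame can be checked numerically. The sketch below uses the frequencies 2, 3, and 5 from the earlier red/green/blue table; the variable names are chosen only for the illustration.

```python
from fractions import Fraction

frequencies = [2, 3, 5]              # frequencies from the frequency table
total = sum(frequencies)             # total number of observations

# Each proportion is a frequency over the common denominator.
proportions = [Fraction(f, total) for f in frequencies]

# Adding the fractions is the same as adding the numerators
# and placing the sum over the common denominator.
sum_of_proportions = sum(proportions)
numerators_over_total = Fraction(sum(frequencies), total)
```

Because the numerators add up to the total number of observations, both expressions reduce to the same value.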


166. We stated earlier that a statistic is any number or term 
that summarizes a collection of data. Each of the 
proportions in a proportional distribution summarizes 
something about the frequency of a particular value in 
the collection of data. Therefore, each proportion 


be called a statistic. would 
would/would not 
167. Of the three tables shown below, two are ______         absolute 
                                              absolute/relative 
frequency tables and the other is a ______                   relative 
                                    absolute/relative 
frequency table. 
TABLE A 


FREQUENCY 


0-100 sec. 
101-200 sec. 


0-100 sec. 


101-200 sec. 


FREQUENCY 


1/10 
9/10 


0-100 sec. 
101-200 sec. 


168. Table C is a relative frequency distribution describing 
the data in Table ______.                                    A or B 
                  A/B/A or B 
169. Even though the enumerative statistics in Table A 


are different from those in Table B, both of these 
absolute frequency distributions can be represented 


by the same ______ frequency distribution.      relative (proportional) 
170. 


171. 


Just as you could present an absolute frequency 

distribution on a graph, it is also possible to present a 

proportional distribution in graphic form. In the 

distribution graph, we represented the frequency of 

each value by the ______ of a column. In this same 

way, we may represent each proportion in a proportional 

distribution. 

The highest possible column that could ever occur on 

a proportional distribution graph would represent a 

proportion of ______. The three graphs shown 

below represent the same data shown just previously 

in tables A, B, and C. Notice that Graph C is a 
______ frequency distribution graph, 

just as Table C was a proportional frequency table. 
172. 


173. 


174. 


Notice that it is difficult to compare the two absolute 
frequency distributions in Graph A and B because the 
number of observations in each collection of data was 
______.                                                      different 
the same/different 


The difference in the size of the collection of data could 
easily obscure the similarity of the two distributions. 
This similarity is the fact that the ______                  relative 
                                     absolute/relative 
frequency distributions of the two groups of time are 


the same. 


Therefore, if you wish to compare two collections of 
data when there are different numbers of observations 


in each collection, it is often useful to use a(n) 
______ frequency.                            relative, or proportional 
absolute/relative 
REVIEW I 


FILL IN THE BLANKS: 


ic 


If a value is not recorded in the data, the value has a 
frequency of ______. 

A collection of frequencies in a frequency table is called 
a frequency ______. 


Let us assume that we have two red marbles, five green 
marbles, and four blue marbles in a box. The 
proportion of red marbles is ______. 

Let us assume that we have two groups in an experiment. 
Group A has ten brunettes and five blonds. Group B has 
twenty brunettes and ten blonds. The relative frequency 
of the two groups is ______. 
                     the same/different 

The largest possible proportion is ______. 

The sum of the proportions in a proportional distribution 
must equal ______. 


[Table of scores with a Totals row] 


What is the score in the second row, fourth column? 
zero 


distribution 


2/11 


the same 
MULTIPLE CHOICE: 


10. 


11. 


A list of things, one underneath the other, is called a: 


a. column. 
b. row. 
c. radial. 


d. none of the above 


A list of things arranged side by side is called a: 


a. column. 
b. radial. 
c. row. 


d. none of the above 


If we list all the observations and do not summarize 


them, we have: 
a. finished data. 
b. theoretical distribution. 
c. raw data. 


d. none of the above 


A number, or term, that summarizes or describes a 
collection of data is called: 
a. raw data. 
b. a statistic. 
c. a theoretical distribution. 


d. none of the above 



12. 


Frequencies are often called: 
a. raw data. 
b. inoperative. 
c. enumerative statistics. 


d. none of the above 


TRUE OR FALSE: 


13. 


14. 


15. 


16. 


17. 


18. 


19. 


We refer to something that does not change during an 


experiment as a constant. 


Something that does change in an experiment is 


called a variable. 
The numbers one to ten are written on pieces of paper 
and placed in a hat. The number 4 is drawn from the 


hat. The number 4 is an observed value. 


Let us assume that we have an ordinary die. On such a 
die, the number " is a possible value. 


The number of nickels in a cookie jar would be an 


example of a continuous variable. 


Numerals can only be used to represent numbers. 


Only a list of observed values can be referred to as 
data. 


The review that you have just completed can assist you in evaluating your 
progress at this point in the program. If you had no difficulty with the review, 
proceed to the next section. If you did have trouble with any of the review questions, 
return to the place in the program where this material is presented and make sure 
you understand the material before going on to the next section. 


Follow this procedure with each of the reviews in the program. 
Section III: Central Tendency 


You have seen how any collection of data can be regarded 
as a distribution of values. By "distribution" we mean 
the number of times each of the possible values has 

been ______. 


Next, we shall consider some of the ways in which 
distributions differ and how these differences can be 
described. Consider, for example, the two distributions 
shown below. 

[Two frequency distribution graphs plotted against VALUE: DISTRIBUTION A and DISTRIBUTION B] 


You would probably say that the typical observed value 
in Distribution B is ______ than the typical 
                     larger/smaller 
value of Distribution A. 

While the two distributions have approximately the same 
shape on the graph, the center of Distribution B is at 
a ______ value than the center of Distribution A. 
  larger/smaller 
You could say the central tendency of Distribution A is 

different from the central tendency of Distribution B. 

In other words, if the values in one collection of data 

are generally larger than the values in another collection 

of data, you could say the distributions of the two 

collections of data have ______ central                      a different 
                         the same/a different 
tendency. 


Consider the three collections of data listed below: 


Data A: 2, 4, 3, 12, 6, 4 
Data B: 10, 12, 14, 16, 50 
Data C: 1, 3, 2, 3, 2, 6 


Notice that the values in Data A are more similar to the 
values in Data ______ than they are to the values in         C 
               C/B 
Data ______.                                                 B 
     C/B 
The distributions of Data A and Data B seem to have 
______ central tendencies, whereas the                       different 
similar/different 
distributions of Data A and Data C have ______               similar 
                                        similar/different 
central tendencies. 


If the values in two distributions were quite similar, we 


would describe the two distributions as having similar 
central tendencies 


Consider the two sets of data shown below. 
Data A: 4, 3, 6, 6, 4, 6 
Data B: 21, 23, 20, 21, 23, 20 
You could describe Data ______ as having a larger typical    B 
                        A/B 


value than the other collection of data. 
(Continued) 


Suppose the two collections of data represented the ages 
of people at two different parties. In other words, 


Data ______ would have been collected at a children's 
     A/B 
party and Data ______ would have been collected at a 
               A/B 
party for young adults. 


A person who was 5 years old would be more representative 
of people at a party corresponding to Data ______ than 
                                           A/B 
he would be of the people at the other party. 


Even though it does not occur in Data A, the number 5 is 
more representative or typical of that collection of data 
than it is of the other collection. In a similar sense, the 


number 22 is more typical of Data ______ than it is of 
                                  A/B 


the other collection of data. 


The two previous collections of data differ mainly in the 
magnitude or size of the recorded values. This can be 
illustrated by the two graphs of raw data shown below. 


[Graphs of raw data, values plotted against OBSERVATION: DATA A and DATA B] 

Notice how the values in Data ______ all seem to be close 
                              A/B 
to (cluster around) the value 22, whereas the values in 
the other data seem to cluster around the value ______. 
                                                5/10 
Suppose a student named John received grades of 75, 
70, 76, 75, 76 in a course. Another student, 
Dick, received grades of 95, 95, 96, 75, 98. A grade of 
75 would be more typical or representative of 
______ grades, even though both students had 
John's/Dick's 
received a grade of 75 during the course. 


The reason a grade of 75 is more typical of John's grades 
is that a score of 75 was unusual for Dick. Most of 
Dick's grades seem to be clustered around ______. 
                                          95/72 


We would say, therefore, that the central tendency of 
John's grades seems to be ______ than the central 
                          higher/lower 
tendency of Dick's grades. 


If someone asked you to characterize John's work by 
reporting his typical grade, you would be more likely to 
answer ______ than you would ______. 
       75/85                 75/85 

[Two frequency distribution graphs: DISTRIBUTION A and DISTRIBUTION B] 

Notice how the values in Distribution A seem to 
be clustered around the value ______, whereas the values 
                              5/10 
in Distribution B are clustered around the value ______. 
19. 


In other words, you could describe the difference 
between the two distributions shown in the previous 
frame by pointing out that the central tendency of one 
distribution was different from that of the other. For 
example, if you said you were referring to the 
distribution whose typical or central value was 10, it 
would be clear that you were referring to Distribution 
______ rather than to the other distribution. 
A/B 


The characteristic of distributions we have been referring 
to as the "typical value" is generally a value near the 
"center" of the distribution. In other words, we can 
usually think of the values in a distribution as being 
clustered around (close to) a typical or central value. 
That is why we refer to this characteristic of a 
distribution as its central ______. 


Up to this point we have not been very specific about 
what we mean by "central tendency." We have purposely 
not done so because there is more than one way to define 
the typical value or central tendency of a distribution. 
In other words, there ______ more than one acceptable 
                      is/is not 


way of defining the central tendency of a distribution. 


If one value occurred more frequently in a distribution 
than any other value, you might consider that value the 
most common or typical value of the distribution. 
Therefore, one way of characterizing the central tendency 


of a distribution ______ be to report the most 
                  would/would not 


frequently occurring value in that distribution. 
Statisticians use the most frequently occurring value in 

a distribution to characterize the central tendency of 

that distribution. Statisticians call this most frequent 

value the mode of the distribution. Thus, the value having 

the highest frequency in a frequency table ______            would 
                                           would/would not 
be called the mode of the distribution represented in that 
frequency table. 


D 


2 
3 
4 
5 


The value occurring most frequently in this collection of 


data is ______, since this value has a frequency of ______.  40, 3 


The mode is the most frequently occurring value in a 
distribution; therefore, 40 is the ______ of the             mode 
distribution shown in the previous table. 


The number ______ could be used to characterize the          40 
central tendency of the distribution shown in the previous 
table if you used the mode to characterize the central 
tendency. 


The ______ column on a frequency distribution                highest (tallest) 


graph indicates the most frequently occurring value in 


that distribution. 
27. 


(Continued) 


The value having the highest column in the graph shown 
above is ______. We could characterize the central 
tendency of this distribution by reporting that its 
______ was 5. 


You can indicate that 5 is the most frequently occurring 
value in the previous data by saying that 5 is the modal 
value. Thus, if the mode of a distribution is 12, you 
would say 12 was the m______ (most frequently 


occurring) value in that distribution. 


Suppose you were considering a distribution in which the 
largest frequency was 10. Suppose, however, that more 
than one value had a frequency of 10. In this case, both 
of these values could be referred to as the mode of that 
distribution. For example, suppose you had a collection 
of data consisting of 10 observations of a variable with 
3 possible values: a, b, and c. If the frequency of both 
a and b was 4, and the frequency of c was 2, you 
______ say that both a and b were modes in that 
could/could not 


distribution. 
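
A minimal sketch of finding the mode, including the case of more than one mode, is given below. The function name `modes` is an assumption made for the example; the data follow the a, b, c example in this frame.

```python
from collections import Counter

def modes(observations):
    """Return every value that has the highest frequency, in sorted order."""
    counts = Counter(observations)     # frequency of each value
    highest = max(counts.values())     # the largest frequency
    return sorted(v for v, f in counts.items() if f == highest)

# Ten observations: a and b each occur 4 times, c occurs 2 times.
letters = ["a"] * 4 + ["b"] * 4 + ["c"] * 2
letter_modes = modes(letters)

# The same idea works for non-numerical data such as coin tosses.
tosses = ["heads"] * 3 + ["tails"] * 2
toss_modes = modes(tosses)
```

Returning a list rather than a single value is a design choice made here so that a distribution with two modes reports both of them.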


Consider the table of data shown below. 


[Table of raw data with columns Observations and Value] 


The value 11 has a frequency of ______. 
The value 21 has a frequency of ______. 
The value 33 has a frequency of ______. 
29. 


(Continued) 


The two modes in this collection of data are the values 
______ and ______.                                           11, 21 

It would be possible to find the value that occurred more 

frequently than any other value in either numerical or 


non-numerical data. 


Therefore, a collection of non-numerical data can be 

described in terms of the modal, or most frequently 

occurring value. For example, if you tossed a coin 

5 times and observed 3 " heads" and 2 "tails," you could 

say ______ was the modal (most frequently                    heads 
    heads/tails 


occurring) value in your data. 


Although the mode is often a useful way of characterizing 
the central tendency of a distribution, it is sometimes 
misleading to describe a distribution by its mode. 
Consider the two frequency distribution graphs shown 


below: 
[Two frequency distribution graphs plotted against VALUE: GRAPH A and GRAPH B] 

Notice how the mode of Distribution ______ is near the       A 
                                    A/B 
center of the distribution and is typical of the values in 
that distribution, whereas the mode of the other 
distribution is not near the center of the distribution and 
is not particularly typical of the values observed in that 


collection of data. 
31. 


32. 


Partly because the mode of a distribution is not always 
representative of the central or typical value in a 
distribution, statisticians have defined other ways of 
characterizing the central tendency of a distribution. 


Another way of representing central tendency is to report 
a value which is smaller than the same number of 
observations as it is larger than. This value is called 
the median of the distribution. For example, suppose 
your data consisted of the observations 4, 5, 7, 8, and 


10. The value 7 would be greater than ______ of the 
                                      3/2 
remaining observations and smaller than ______ of the 
                                        3/2 
remaining observations. Therefore, we could call 
______ the median of these five observations. 


The easiest way of finding the median of a collection of 
data is to list all the observed values in the order of their 
size. This procedure is called ranking the data. If your 
data consisted of the values 4, 3, 8, and 7, then the list 
of values 8, 7, ______, and ______ would be a ranking of 
                3/4         3/4 
the data. 
If you were to rank the values 10, 6, 11, and 4, you 
would start with the largest value and end with the 
smallest to form the list ______, ______, ______, and ______. 
34. 


35. 


36. 


You could rearrange the observations in a table of 

raw data, arranging the values in the order of their size 
rather than in the order in which they were observed. 
Thus, we would be forming this new table by ranking the 


observations. 


Notice that the data shown in Table B ______ 
                                      were/were not 


formed by ranking the data in Table A. 
1 
2 d 2 
3 11 3 
4 8 4 
TABLE A TABLE B 
Table D (shown below) ______ formed by ranking 
                      was/was not 
the data in Table C. 


d 
> r 2 
3 11 3 
4 8 4 
TABLE C TABLE D 
We stated earlier that the median value in a collection of 
data was a value which was ______ than half of the 
other observed values and ______ than the other 


remaining values. 


One way of finding the median of a collection of data is to 
rank the data and locate the value which divides this list 
of ranked values in half. If 10, 7, 6, 2, and 1 were a 
list of ranked values, the value ______ would divide the 
list in half. Two of the other values would be larger 
than 6, whereas ______ of the remaining 
                (number) 
values would be smaller than 6. 
were not 


was 


smaller 


larger 
38. 


(Continued) 


Since 6 is larger than half of the remaining values and 
smaller than the other half, the median would be ______. 


It is a simple matter to find the median of a distribution 
when you have an odd number of observations. You 
simply rank the observations and find the middle value 
in this list of ranked observations. This middle value 
would be the ______ of your data. 


If you had an even number of observations, there would 
not be a value in the list of ranked data such that the 
same number of observations fell above and 


below that value. 


To illustrate this problem consider the table of ranked 


data shown below: 


Notice how we have indicated the rank of each value in 
the first column of the table. The largest observed 
value has a rank of ______ and the smallest observed 
value has a rank of ______. 
Notice also that there are ______ observations larger 
than the value 40 and ______ observations smaller than the 
value 40. Thus, 40 ______ the median. There are ______ 
                   is/is not 
observations larger than the value 30 and ______ observations 
less than the value 30, indicating that 30 ______ the median. 
                                           is/is not 
(Continued) 


Neither 30 nor 40 is the median, since too many values 


are smaller than ______, whereas too many values are 
                 30/40 
larger than ______. 
            30/40 


Strictly speaking, any value between 30 and 40 could be 
called the median of this data. However, statisticians 
have agreed upon a rule for finding the median of an 
even number of observations. They would say that the 
median of the previous collection of data was a value 
halfway between 30 and 40. In other words, 35 
______ be called the median, because 
would/would not 


35 is halfway between 30 and 40. 


Suppose 8, 6, 4, and 3 were ranked data. The value 
halfway between 6 and 4 would be the value ______. 
Therefore, 5 would be the ______ of this data. 
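
The two rules for the median (middle value for an odd number of observations, halfway value for an even number) might be sketched as follows. The function name `median_by_ranking` is an assumption, and the sketch assumes at least one observation.

```python
def median_by_ranking(observations):
    """Rank the data (largest first) and return the median."""
    ranked = sorted(observations, reverse=True)   # rank 1 is the largest value
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of observations: the middle value of the ranked list.
        return ranked[middle]
    # Even number: the value halfway between the two middle values.
    upper, lower = ranked[middle - 1], ranked[middle]
    return lower + (upper - lower) / 2

odd_case = median_by_ranking([10, 7, 6, 2, 1])
even_case = median_by_ranking([8, 6, 4, 3])
```

The even case adds half of the difference between the two middle values to the smaller one, which is the same halfway rule the text applies to 30 and 40.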


Three lists of ranked data are shown below. 


Data A: 5, 4, 1 

Data B: 6, 3, 2 

Data C: 5, 3, 1, 1 
The middle value in Data A is the value 4; therefore, 4 
would be the ______ of Data A. 


The middle value in Data B is the value ______. This value, 
therefore, would be the ______ of that data. 




40 


30 


would 


median 


median 


median 




41. 


42. 


43. 


44. 


45. 


(Continued) 


Notice that Data A and B consist of an ______ number 
                                       odd/even 
of observations, whereas Data C consists of an ______ 
                                               odd/even 
number of observations. Using the rule for 
data consisting of an even number of observations, we 
would find the value halfway between ______ and ______; this is 
the value ______. Therefore, we would say the median 
of Data C is ______. 


Consider the table of data shown below: 


The largest value in this table is ______ and the 
smallest value is ______. 


If we rank the data in the preceding table, our list of 
values ______ be 200, 160, 180, 100, 30. 
       would/would not 


The previous list was incorrectly ranked. The value 
"180" should have come before ______ rather than after 
it, since 180 is larger than ______. 


We refer to the largest value in a table of ranked data 

as having rank 1. The value having rank 1 in the previous 
collection of data would be ______. The value having 
rank 2 would be ______. 
odd 


even 


200 
30 


would not 


160 
160 


200 
180 
46. 
48. 


49. 


50. 


51. 


The median of the previous collection of data is the
value ______, since the values ______ and ______ are
larger and the values ______ and ______ are smaller
than this median value.


However, if the values in the previous collection of data
were 200, 180, 160, and 100, the median would be a value
halfway between ______ and ______.


To find the difference between 180 and 160, you subtract
160 from 180. Half of this difference is ______, which
is what you would add to the value 160. Therefore, the
median is ______.
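
The ranking rule used in these frames can be sketched in code. This is only an illustration, assuming Python; the function name `median_of` is hypothetical and is not part of the program itself.

```python
def median_of(values):
    """Return the middle ranked value, or the value halfway between
    the two middle values when the number of observations is even."""
    ranked = sorted(values, reverse=True)  # rank 1 is the largest value
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]
    # Even count: add half the difference to the smaller middle value.
    larger, smaller = ranked[middle - 1], ranked[middle]
    return smaller + (larger - smaller) / 2


odd_case = median_of([200, 160, 180, 100, 30])   # five observations
even_case = median_of([200, 180, 160, 100])      # four observations
print(odd_case, even_case)
```

The even case follows the same steps as the frame above: take the difference between the two middle values, halve it, and add the result to the smaller of the two.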


We have now considered two ways of representing the
central tendency of a distribution: by the ______ and
by the ______.


The ______ (mode/median) of a distribution is the most
frequently occurring value in the distribution, whereas the
______ (median/mode) is a value that is less than half of
the other values and greater than half of the remaining values.


Suppose your data were the values:

10, 5, 1, 10 and 4.

The median of the distribution is the value ______,
whereas the mode is the value ______.
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160, 180, 200 
100, 30 


160, 180 


10 


170 


median 


mode 


mode 


median 
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52. 


53. 


54. 


55. 
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= 


The mode of the collection of data shown above is ______,
since this value occurs ______ times.


The value 9 ______ (is/is not) the median of the previous
collection of data, since there are ______ observations
with values larger than 9 and ______ observations with
values smaller than 9.


While 9 is too large to be the median, 7 is too ______
to be the median.


To find the median of the previous data, you would find
the value halfway between ______ and ______. Therefore,
the median of the previous collection of data is ______.


The median (like the mode) can sometimes give a
misleading picture of a distribution. For example,
consider the two collections of data shown below:

Data A: 100, 99, 98, 97, 96.

Data B: 100, 99, 98, 4, 2.

The median of Data A is ______ and the median of Data B
is ______.
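
As a rough check on the comparison of Data A and Data B, the short sketch below (assuming Python and its standard `statistics` module; the variable names are illustrative only) computes both medians and then lists how far each value lies from its own median.

```python
from statistics import median

data_a = [100, 99, 98, 97, 96]
data_b = [100, 99, 98, 4, 2]

median_a = median(data_a)
median_b = median(data_b)

# Distance of each value from the median of its own collection.
spread_a = [value - median_a for value in data_a]
spread_b = [value - median_b for value in data_b]

print(median_a, spread_a)
print(median_b, spread_b)
```

The two medians agree even though the distances from the median are very different in the two collections, which is the sense in which the median alone can mislead.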
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is not 
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56. 


57. 


While both distributions have the same median, the
values below the median in Data ______ (A/B) differ much
more from the median than do the values above it.


The median only indicates the value dividing the list of 
ranked data into two equal parts. The median does not 
indicate how much smaller or how much larger are the 
values falling above it or below it in the list. This 
can be illustrated by the following raw data graphs. 


[GRAPH A: raw data graph; vertical axis VALUE, horizontal axis OBSERVATION]


[GRAPH B: raw data graph; vertical axis VALUE, horizontal axis OBSERVATION]


Notice that in both collections of data ______
observations had a value larger than the value of
observation 3 and ______ observations had a value
smaller than the value of observation 3. Thus, ______
is the median of both distributions since it is the value
of observation ______.
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58. 


59. 


60. 


61. 


The table of data shown below contains ______
observations of a ______ (numerical/non-numerical) variable.

Observation   Value


The graph of raw data shown below ______ (does/does not)
represent the data in the previous table.

[Graph of raw data: vertical axis VALUE, horizontal axis OBSERVATION (1, 2, 3)]


The previous graph does not represent the previous
table of data, since observation 1 had a value of 5 and
observation 2 had a value of 8 in the table, whereas
observation 1 had a value of ______ and observation 2
had a value of ______ in the graph.


The graph of raw data shown below ______ (does/does not)
represent the data shown in the previous table.

[Graph of raw data: vertical axis VALUE, horizontal axis OBSERVATION (1, 2, 3)]
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61. 


62. 


(Continued) 


All the observed values in the previous graph
______ (are/are not) the same.


Differences in the observed values are indicated on the
graph by differences in the ______ of the 3 columns.


Thus, the value of observation 1 is clearly less than
that of observation 2, since the height of column 1 is
less than that of column 2. Similarly, the value of
observation 2 is ______ (less/greater) than the value of
observation 3, since column ______ is higher than
column ______.


While you can always compare the value of one
observation to the value of another observation, it
wouldn't make much sense to simply say, "Observation
3 is less than." The immediate question would be:
"Less than what?" It is often convenient to pick a
reference value for comparison with the observed values.
For example, suppose you chose the value 4 as a
reference value. You could then describe the data in
the previous graph by saying, "Observation 1 is greater
than 4, observation 2 is greater than 4, and
observation 3 is ______ than 4."
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63. 


64. 


The relationship between the reference value 4 and the value 
of each of the three observations is made clear by the 


following graph. 


[Graph: the three observations with a dotted line marking REFERENCE VALUE 4; vertical axis VALUE, horizontal axis OBSERVATION (1, 2, 3)]


While this graph represents the same data as does the
previous graph, we have indicated the reference value 4
with a dotted line drawn across the graph at a height
equal to the value ______. We have also indicated the
difference between the reference value and each observed
value with an arrow. The arrow in column 1 is pointing
upwards, since the observed value 5 is greater than the
reference value 4. Similarly, the arrow points ______
(up/down) in column 2, since the value of observation 2
is ______ (greater/less) than the reference value 4.

Answers: 4; up; greater


The value of observation 3 is ______ (less/greater) than the
reference value 4. Therefore, the arrow on our graph
points ______ (up/down). Notice that the difference between the
value of observation 1 and the reference value is less
than the difference between the value of observation 2
and the reference value. This is indicated on the graph
by the fact that the arrow in column 1 is ______
(longer/shorter) than the arrow in column 2. Thus, the size of the
difference between the observed value and the reference
value is indicated by the ______ of the arrow
representing that difference.

Answers: less; down; shorter; length
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65. 


66. 


The length of the arrow represents the difference between
the reference value and the observed value, regardless
of the direction in which the arrow is pointing. Thus, the
arrow points up in column 2 since the second observed
value is greater than the reference value, whereas the
arrow points down in column 3 because the third
observed value is ______ than the reference value.
However, the value of observation 3 is closer to the
reference value than is the value of observation 2. This
is indicated by the fact that the arrow in column ______
is ______ (shorter/longer) than the arrow in column 2.


Differences between observed values and a reference
value (represented with arrows in the previous graph)
are referred to as deviations by statisticians. If a
value is greater than the reference value, that value has
a positive deviation from the reference value. If a value
is less than the reference value, that value has a
negative deviation from the reference value. Since the
value of observation 1 (on the previous graph) is greater
than the reference value 4, you would say that the value
of observation 1 had a positive deviation from the
reference value 4. Similarly, since the value of
observation 2 is ______ (greater/less) than the reference value 4,
you would say the value of observation 2 had a
______ (positive/negative) deviation from the reference value 4.


Since the value of observation 3 is less than the
reference value 4, we would say the value of observation
3 had a ______ (positive/negative) deviation from the
reference value 4.
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67. 


68. 


69. 


The amount by which a particular value deviates from
the reference value is indicated by the length of the
arrow in the previous graph. Whether the deviation is
positive or negative is indicated by the direction in
which the arrow points. In other words, whenever an
observed value is greater than the reference value, the
arrow will point ______ (up/down), indicating a
______ (positive/negative) deviation. Whenever the
observed value is less than the reference value, the
arrow will point ______ (up/down), indicating
a ______ (positive/negative) deviation.


The graph of raw data shown below represents 2
observations having a value of ______ and ______.

[Graph of raw data: vertical axis VALUE (1 to 6), horizontal axis OBSERVATION (1, 2)]


Suppose you chose 0 as the reference value for
describing the two observations in the previous graph.
Both of the observed values would be ______ (greater/less)
than the reference value and would represent ______
(positive/negative) deviations from the reference value.
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70. 


71.


72. 


Suppose you chose 8 as a reference value. Since the
value of both observations is ______ (greater/less) than 8,
both deviations would be ______ (negative/positive).

Answers: less; negative


Suppose you chose a reference value somewhere between
6 and 2. The value of observation 1 would represent a
______ (positive/negative) deviation from the reference value,
whereas the value of observation 2 would represent a
______ (positive/negative) deviation.

Answers: positive; negative


We have redrawn the preceding graph and indicated below
the deviation of each observation from a reference value
of ______.

[Graph: vertical axis VALUE with the REFERENCE VALUE marked, horizontal axis OBSERVATION (1, 2)]


The reference value is closer to the value of observation
2 than it is to the value of observation 1. This is
indicated by the fact that the arrow pointing up is
______ (longer/shorter) than the arrow pointing down. We can
calculate the actual size of the deviation simply by
subtracting the reference value from the observed value.
For example, you would calculate the deviation of
observation 1 from the reference value by subtracting 3
from ______.

Answer: longer
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73. 


74. 


75. 


76. 


77.


The deviation of observation 1 from the reference value 3
equals 3 subtracted from 6, which equals ______.

Answer: 3


Similarly, you would calculate the deviation of
observation 2 from the reference value 3 by subtracting
3 from ______.

Answer: 2


Notice that 2 minus 3 equals -1. This means the deviation
of observation 2 from the reference value 3 is ______ (-1/+1).


When we subtract a reference value from a smaller value,
our answer will be negative. Therefore, all negative
deviations will be represented by ______ (positive/negative)
numbers. On the other hand, all positive
deviations will be represented by positive numbers, since
the observed value is, in this case, greater than the
reference value.

Answer: negative


In the table shown below, we have summarized the
information concerning deviations of the observed values
from the reference value 3.

OBSERVATION   VALUE   DEVIATION FROM 3
     1          6           +3
     2          2           -1


We have simply added another column to the previous
table of raw data and listed the deviations of each of the
observed values from the value ______. Thus, the
numeral -1 in the last row of the third column represents
the deviation of observation 2 from the reference value 3.
We obtained the deviation of observation 2 from the
reference value 3 by subtracting ______ from ______, to
give us an answer of ______.

Answers: 3; 3, 2; -1
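
The subtraction rule for deviations can be written as a very small routine. The sketch below assumes Python; `deviations_from` is a hypothetical helper name chosen for illustration, and the observed values 6 and 2 are the ones used in the frames above.

```python
def deviations_from(reference, values):
    """Subtract the reference value from each observed value."""
    return [value - reference for value in values]


observed = [6, 2]

from_3 = deviations_from(3, observed)   # one positive, one negative
from_0 = deviations_from(0, observed)   # both positive
from_8 = deviations_from(8, observed)   # both negative
print(from_3, from_0, from_8)
```

A positive result means the observed value lies above the reference value, and a negative result means it lies below.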
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82. 


83. 


The following table contains the same two observations.
However, we have left room to indicate deviations from
the reference value ______ rather than 3 (as in the
previous table).

OBSERVATION   VALUE   DEVIATION FROM 4


To find the deviation of observation 1 from the
reference value 4, you subtract 4 from ______ (6/2).
Therefore, the deviation of observation 1 from the
reference value 4 is ______.


The following table ______ (is/is not) correct.

OBSERVATION   VALUE   DEVIATION FROM 4


The previous table is incorrect because observation 2
was smaller than the reference value. Therefore, the
deviation of observation 2 from the reference value
would have to be represented by ______ rather than
2.


The following table ______ (is/is not) correct.

OBSERVATION   VALUE   DEVIATION FROM 4
1
2
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85. 


86. 


87. 


88. 


There is something special about using 4 as the
reference value. The positive deviation represents
the same difference between the observed value and the
reference value as does the negative deviation. In other
words, if we represented deviations on a graph as we
did previously, the length of the two arrows would be
______ (the same/different) (although one arrow would be
pointing up and the other down).


Because the reference value 4 has the unique property of
being as close to the value 6 as it is to the value 2, we
say that 4 is the mean of the values 6 and 2. Since 6 and
2 are the two observed values in the previous collection
of data, we could say that the mean of that collection of
data is ______.


If we added the positive deviation from 4 and the negative
deviation from 4, our answer would be ______ because
+2 added to -2 equals ______.


Another way, therefore, of describing that unique
characteristic of the reference value 4, which makes it
the mean of the previous collection of data, is to say the
sum of the deviations from 4 equals ______.


Suppose your data consisted of 3 observations instead of
only 2; for example, 8, 3, and 1. If you chose 10 as
your reference value, all the deviations would be
______ (positive/negative). If you chose 0 as your reference
value, however, all the deviations would be
______ (positive/negative).

OBSERVATION   VALUE   DEVIATIONS FROM 6
     1          8           +2
     2          3           -3
     3          1           -5
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the same 


negative 


positive 
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93. 


To check the deviation of observation 3 in the previous
table, you would subtract the value ______ from the
value 8. This would indicate that the deviation in
the table was ______ (correct/incorrect).


We mentioned that a reference value is called a mean
when the sum of the deviations from that reference value
equals 0. For the set of data we considered earlier, for
example, the deviations from the mean value 4 were +2
and -2, giving a sum of deviations equal to (+2) + (-2),
which equals ______.


Even when the data consist of more than two observations,
we can define the mean in the same way. In other words,
if we add all the deviations from a particular reference
value and our answer is 0, that reference value is the
mean of those observations. Consider the previous table
of deviations from the reference value 6. To find the
sum of the deviations, we would add +2, -3, and -5,
which yields an answer of ______.


Since the sum of the deviations of our three observed
values from the reference value 6 does not equal 0, the
value 6 ______ (is/is not) the mean of these three observations.
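
The test described here, adding up the deviations and seeing whether the total is 0, can be sketched as follows. The sketch assumes Python and whole-number data; the names `sum_of_deviations` and `is_mean` are hypothetical and chosen only for illustration.

```python
def sum_of_deviations(reference, values):
    """Add up the deviations of every observed value from the reference."""
    return sum(value - reference for value in values)


def is_mean(reference, values):
    """A reference value is the mean when the deviations total zero."""
    return sum_of_deviations(reference, values) == 0


observed = [8, 3, 1]

total_from_6 = sum_of_deviations(6, observed)
total_from_4 = sum_of_deviations(4, observed)
print(total_from_6, is_mean(6, observed))
print(total_from_4, is_mean(4, observed))
```

With fractional data an exact comparison with zero can fail because of rounding, so this check is only a sketch for whole-number examples like the ones in these frames.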


We could record the deviations of each observation from
the reference value in the following table:

OBSERVATION   VALUE   DEVIATIONS FROM 4
     1          8
     2          3
     3          1


We would find the deviation of observation 1 by subtracting
______ from ______, indicating that the deviation of observation 1
from the reference value was ______.


correct 
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In the same way, we would find that the deviation of 
observation 2 was ______ and that the deviation of
observation 3 was ______.


We have recorded the deviations from the reference
value 4 in the following table:

OBSERVATION   VALUE   DEVIATIONS FROM 4
     1          8           +4
     2          3           -1
     3          1           -3


We said that the mean or average of a group of numerical
observations is that particular reference value yielding
deviations whose total is 0. To find the total of the
deviations from the reference value 4 for the three
observations in this table, we would add ______, ______, and
______. Thus, 4 ______ (is/is not) the mean of this group of
observations.


We can illustrate this graphically as follows:

[Graph: the three observations plotted against the reference value 4, with one arrow labeled POSITIVE DEVIATION and two connected arrows labeled NEGATIVE DEVIATION; vertical axis VALUE (1 to 8), horizontal axis OBSERVATION]


The arrow labeled "positive deviation" represents the one 
positive deviation in the graph. The two downward 
pointing arrows, which are connected together, represent 
the two negative deviations in the graph. In order for the 
sum of the deviations to equal 0, the positive deviations 
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98. 


(Continued) 


must exactly balance the negative deviations. In other
words, the total length of the arrows representing the
positive deviations must be exactly the ______ as the
total length of the arrows representing negative
deviations; this is apparently true when the reference
value is 4. Thus, ______ is the mean of these three
observations.

Answers: same; 4


Consider the following two graphs.

[GRAPH A and GRAPH B: each shows three observations and their deviations from a reference value; vertical axis VALUE, horizontal axis OBSERVATION (1, 2, 3)]


Remember, the mean is that reference value from which
the sum of the deviations equals 0. Keeping this in mind,
it would appear that the reference value shown in Graph
______ (A/B) is more likely the mean than is the reference value
in the other graph.

Answer: B


The reference value in Graph B is more likely the mean
because the two positive deviations added together almost
equal the negative deviation in size, whereas the two
positive deviations in Graph A added together appear to
be ______ (larger/smaller) than the size of the negative
deviation in that graph.

Answer: larger
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99. The mean is a useful way of representing the central 
tendency of a distribution. However, we do not yet
have a simple way of calculating the mean from an
actual collection of data. Of course, you could try one
reference value after another, calculating the deviations
from each of these reference values. You could
eventually locate a reference value from which the sum
of the deviations equaled ______, which would
indicate that reference value was the mean of the
collection of data.

Answer: zero


If the collection of data consisted of only a few
observations, you could probably find the mean in this
manner. However, if the data consisted of many
observations, the procedure outlined in the previous
frame would not be practical. Therefore, we will
consider a procedure whereby we can calculate the
mean of any collection of data according to a simple
rule. A rule for calculating the value of some statistic
is called the formula for that statistic. Thus, a rule
for calculating the mean would be called a ______
for the mean.

Answer: formula
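
To make the trial-and-error procedure concrete, here is a minimal sketch, assuming Python, whole-number data, and a search limited to whole-number reference values. The function name `find_mean_by_trial` is hypothetical.

```python
def find_mean_by_trial(values):
    """Try each whole-number reference value between the smallest and
    largest observation; return the first one whose deviations sum to
    zero, or None if no whole number works."""
    for reference in range(min(values), max(values) + 1):
        if sum(value - reference for value in values) == 0:
            return reference
    return None


found = find_mean_by_trial([8, 3, 1])
not_found = find_mean_by_trial([1, 2])   # the mean is not a whole number
print(found, not_found)
```

The search has to recompute every deviation for every candidate, and it fails outright when the mean is not one of the candidates tried, which is one way to see why a direct formula is preferable.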


100. In order to discuss ways of calculating specific 
statistics, it is often useful to talk about data in general 
rather than about particular observed values. For 
example, consider the two tables of raw data shown 
below. 

[TABLE A and TABLE B: two tables of raw data, each listing observations 1 through 4]


Each table lists the values for ______ (4/6) observations of a
numerical variable. (You might wish to insert a bookmark
here since we will refer to these tables in later frames.)

Answer: 4
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The term "observation 1" specifies a particular value 
in each table. It specifies the value 20 in Table A and
the value ______ in Table B.


Similarly, the term "observation 3" refers to the
value ______ in Table A and the value ______ in Table B.


Instead of writing out "observation 1" or "observation 2,"
statisticians have found it simpler to use the symbol $X_1$
to represent the same value as does the term
"observation 1." Thus, $X_1$ represents the value 20 in
the previous Table A and the value ______ in Table B.


The number which appears after and just below
the capital letter X in $X_1$ is called a subscript. The
subscript indicates which particular observation you are
representing. Thus, the subscript ______ in $X_1$
indicates you are representing observation 1.


Notice that the symbol $X_2$ has a s______ 2 instead
of a subscript 1. Similarly, ______ ($X_2$/$X_3$) has a
subscript 3 instead of a subscript 2.


Since the symbol $X_1$ indicates "observation 1," you
would use the symbol $X_2$ to represent "observation ______."


Notice that the symbol $X_2$ can represent the
______ (first/second) observed value in any table of raw
data, regardless of the particular value of that observation.




10 


42, 0 
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In the same way, you could represent the third observation 
in any table of raw data by ______. The subscript ______
indicates you are referring to observation 3.

Answers: $X_3$; 3


Considering the two previous tables of data, the term $X_1$
would refer to the value 20 in Table A and the value 10
in Table B. $X_3$ would refer to the value ______ in Table A
and the value ______ in Table B.

Answers: 42; 0


Notice how you could represent any collection of three
observations with $X_1$, $X_2$, and $X_3$.


If your data consisted of $X_1$, $X_2$, $X_3$, and $X_4$, then there
would be ______ observations in your collection of data.

Answer: four


Statisticians often use a capital letter N to represent the
number of observations in a collection of data. If the
collection of data consisted of four observations, N
would equal 4. If the data consisted of ten observations,
N would equal ______.

Answer: 10


Since N represents the ______ of observations in a
table of raw data, you could represent any collection of
N observations with a column of X's starting with $X_1$,
then $X_2$, and so on, until $X_N$.

Answer: number


This way of representing tables of data is useful because
you can describe a general rule or formula for calculating
some statistic without speaking of particular values. For
example, you could represent the procedure for getting
the sum or total of a collection of five observations with
the formula:

Total = $X_1 + X_2 +$ ______ $+$ ______ $+ X_5$

Answers: $X_3$; $X_4$
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You could write a general formula for finding the total 
of any collection of data consisting of three observations
as follows:

Total = $X_1 + X_2 + X_3$


It would become awkward to write a formula of this sort
if the collection of data consisted of very many observations.
For example, if the data consisted of one hundred
observations, we would have to write out a string of X
values with plus signs between them, beginning with $X_1$
and ending with an X having the subscript ______.

Answer: 100


One way you could simplify the formula would
be to write out the phrase "sum all the X's," which would
mean ______ (add/multiply) together the values of all of the
observations in the table of data.

Answer: add


Statisticians have a special shorthand way of writing the
phrase "sum all the X's." Instead of writing out the
whole phrase, they simply write $\sum X$. Thus, if your
data consisted of two observations, $\sum X$ would equal $X_1$
plus $X_2$. If your data consisted of three observations,
$\sum X$ would equal ______ + ______ + ______.

Answers: $X_1$; $X_2$; $X_3$


You are now in a position to write a simple
formula for the total of all the observations in any
collection of data. Using the shorthand way of writing
"sum all the X's," you could write the formula for the
total as

Total = ______

Answer: $\sum X$
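
The shorthand "sum all the X's" corresponds to a simple loop. The sketch below assumes Python; the function name `total` and the sample values are illustrative only (the values 10, 2, and 4 are borrowed from a later frame).

```python
def total(observations):
    """Add together every observed value, X_1 through X_N."""
    running_total = 0
    for x in observations:
        running_total += x
    return running_total


observations = [10, 2, 4]      # X_1, X_2, X_3
n = len(observations)          # N, the number of observations
sigma_x = total(observations)
print(n, sigma_x)
```

Python's built-in `sum` does the same job; the explicit loop is written out only to mirror the string of plus signs described in the text.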
The symbol $\Sigma$ is a capital Greek letter called sigma.
Thus, the Greek letter ______ in the expression $\sum X$
indicates to you that the expression represents the ______
of all the X's (observations).

Answers: sigma; sum


The Greek sigma ($\Sigma$) is often referred to as a
summation symbol, since $\sum X$ represents the sum
of the observed values.


Let's return now to the original task of finding a formula
for the mean. Basically, this involves stating the
formal relationship between the mean of the data and the
observed values by a mathematical equation, and then
(by simple algebra) rearranging this equation until it is
in the form of a formula for the mean.


We shall begin by considering a collection of data
consisting of three observations. In other words,
N = ______ for this collection of data.

Answer: three


Just as we can represent observation 1 by $X_1$ without
indicating any particular value, we will represent the
mean by $\bar{x}$. In this way, we can represent the mean
without indicating any particular ______ for the mean.

Answer: value


You could represent the deviation of observation 1 from
the mean as $X_1 - \bar{x}$. In a similar way, you would
represent the deviation of observation 2 from the mean
as (______ $- \bar{x}$).

Answer: $X_2$


If the collection of data consisted of three observations,
you could represent the sum (total) of the three deviations
from the mean by $(X_1 - \bar{x}) + (X_2 - \bar{x}) + ($______$ - \bar{x})$.

Answer: $X_3$


If your three observations had the values 10, 2, and 4,
you could put these actual values in the previous formula
and write it as $(10 - \bar{x}) + ($______$ - \bar{x}) + ($______$ - \bar{x})$.

Answers: 2; 4


94. (continued)

five possible outcomes occurred. The six possible
outcomes are said to be mutually exclusive since the
occurrence of any one outcome ______ the
occurrence of any other possible outcomes.

Answer: excludes


95. To review, we have begun our description of Probability
Theory by defining a sample space as a list of all
______ of a process.

Answer: possible outcomes


96. Furthermore, we will suppose that one and only one of
the outcomes can occur. That is what is meant when we
said that the outcomes are ______.

Answer: mutually exclusive


97. The next basic concept of Probability Theory we will
consider is that of probability distribution. A probability
distribution is simply the assignment of a non-negative
number to each member of a sample space such that the
sum of all these numbers equals 1. Using the sample
space for one toss of a coin as an example, therefore, we
could assign the number .5 to the outcome heads and the
number .5 to the outcome tails. Each of the outcomes in
the sample space has been assigned a number such that
the total of these numbers equals 1. This assignment
of numbers to members of a sample space is an example
of a ______ distribution.

Answer: probability


98. Each of the numbers assigned to a member of a sample
space is called a probability. Therefore, the number .5
assigned to the outcome heads in the previous illustration
is a ______.

Answer: probability


125. 


126. 


127. 


128. 


129. 


According to our previous discussion of the mean, you
know the sum of the deviations from the mean must equal
______. Therefore, the following formula
______ (could/could not) be true if $\bar{x}$ is the mean of
these three observations:

$(10 - \bar{x}) + (2 - \bar{x}) + (4 - \bar{x}) = 5$


If your data consisted of three observations and the
following equation were true, 10 would be the ______
of these three observations.

$(X_1 - 10) + (X_2 - 10) + (X_3 - 10) = 0$


In other words, without actually stating the values of the
observations or the actual value of the mean, we know
(from the definition of the mean) that the following
equation is ______ for any collection of three
observations.

$(X_1 - \bar{x}) + (X_2 - \bar{x}) + (X_3 - \bar{x}) = 0$


According to simple algebra, we ______ (could/could not)
remove the parentheses in the previous equation and write it as
follows:

$X_1 - \bar{x} + X_2 - \bar{x} + X_3 - \bar{x} = 0$


Furthermore, we could rearrange the symbols on the
left-hand side of the previous equation so that the
equation reads as follows:

$X_1 + X_2 + X_3 - \bar{x} - \bar{x} - \bar{x} = 0$
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130. Remember, the left-hand side of the previous equation 
is the sum of the deviations from the mean. Notice how
this sum is made up of the total of all the scores from
which we subtract the mean ______ (three/four) times.

Answer: three


131. We can represent the total of the three observations by
$\sum X$. Also, subtracting the mean three times is the
same as subtracting $3\bar{x}$. Therefore, we could write the
previous equation as $\sum X - $ ______ $\bar{x} = 0$.

Answer: 3


132. The previous equation says that when we subtract $3\bar{x}$
from the sum of the three observations, our answer is
zero. Therefore, the total of the three observations
($\sum X$) must be exactly equal to ______.

Answer: $3\bar{x}$


133. We started with an equation that said the sum of the
deviations from the mean equals zero, and we have
proceeded to the following equation:

$\sum X = 3\bar{x}$

If we divide both sides of the previous equation by 3,
the result would be

$\frac{\sum X}{3} = \frac{3\bar{x}}{3}$

The 3's on the right-hand side of the equation cancel
each other out, leaving us with the equation:

$\frac{\sum X}{3} = \bar{x}$
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“ 
This equation is a formula for the mean, since it says
that if we added together our three values and divided
this sum by the number of observations (3), the result
would equal the mean.


Let's see if this formula works for a particular example.
Suppose your data consisted of the values 2, 8, and 2.
The formula says to first add the observations. This
would give you a sum of ______. Dividing this sum by 3
would give you a result of ______.

Answers: 12; 4


In other words,

$\frac{\sum X}{3} = \frac{2 + 8 + 2}{3} = \frac{12}{3} = 4$


Therefore, according to the formula, the mean of the
three observations is ______. We can check to see whether
4 really is the mean of the previous observations by
considering the following table:

OBSERVATION   VALUE   DEVIATION FROM 4
     1          2           -2
     2          8           +4
     3          2           -2

Answer: 4


The sum of the deviations from the reference value 4
equals ______. Therefore, we know that 4 is the ______
of the 3 observations.

Answers: zero; mean
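
The same check can be carried out in a few lines of code. This is a minimal sketch assuming Python; the function name `mean` is defined here for illustration and simply applies the rule "add the observations, then divide by how many there are."

```python
def mean(observations):
    """Total of the observations divided by the number of observations."""
    n = len(observations)
    return sum(observations) / n


observed = [2, 8, 2]
x_bar = mean(observed)

# Check against the definition: deviations from the mean sum to zero.
deviations = [x - x_bar for x in observed]
print(x_bar, deviations, sum(deviations))
```

Computing the deviations afterward is not part of the formula; it is only there to confirm that the value produced by the formula satisfies the original definition of the mean.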
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136. The formula we derived was for a collection of three 
observations. Suppose the data consisted of 4
observations. We could write the following equation,
which specifies the relationship between the mean and
those 4 observations:

$(X_1 - \bar{x}) + (X_2 - \bar{x}) + (X_3 - \bar{x}) + (X_4 - \bar{x}) = 0$


137. We could remove the parentheses in the previous
equation and rearrange the terms so that the equation
reads as follows:

$X_1 + X_2 + X_3 + X_4 - \bar{x} - \bar{x} - \bar{x} - \bar{x} = 0$


138. Using our shorthand way of writing this, we could say
that:

$\sum X - 4\bar{x} = 0$


139. Remember, the sum of the deviations from the mean
equals $\sum X - 3\bar{x}$ when the data consist of 3 observations,
and $\sum X - 4\bar{x}$ when the data consist of 4 observations.
No matter how many observations there are in the
collection of data, it is not hard to see that the following
equation ______ (would/would not) be true:

$\sum X - N\bar{x} = 0$

where N is the number of observations.

Answer: would


140. Just as we did earlier, we could rearrange this equation
to read: $\sum X = $ ______ $\bar{x}$. Then, dividing both sides of the
equation by N, we would finally arrive at the following
formula for the mean:

$\bar{x} = \frac{\sum X}{N}$

Answer: N
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(Continued) 


This formula says the mean of a collection of N
observations is equal to the total of all of the observations
divided by the ______ of those observations.


Thus, if you had a collection of 10 observations and the
total of all the observations were 50, the mean of the
observations would equal ______ divided by ______. In
other words, the mean would equal ______.


Let's test this formula on the following collection of
5 observations.

OBSERVATION   VALUE   DEVIATIONS FROM 6


Notice that $\sum X$ = ______ for this data (because $\sum X$ is
a shorthand way of writing "sum of all the values").


$\sum X$ equals 30, and N equals ______; thus, our formula
for the mean says to divide ______ by ______, which
would give a result of ______.


The formula says 6 is the mean of these five observations.
In the third column of the table, we have listed the
deviations of each observation from the reference value
______. It is clear that the sum of these deviations equals
______. This indicates that 6 ______ (is/is not) the mean of
these 5 observations.
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30 


30, 5 
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145. We have considered three ways of representing the central 


central tendency of a distribution: the m F 


m and the m o median, mean 


the mode 


146. Often a distribution will have a different mean, median,
and mode. The data shown in the previous table,
however, has a mode of ______, a median of ______, and      6, 6
a mean of ______.

147. Six is the ______ of the previous collection of data      mode
since this is the most frequently occurring value. Six
is also the ______ of the previous collection of      median
data since there is, in the collection of data, one value
greater than six and one value less than six. Finally, 6
is the ______ of the previous collection of data      mean
because the sum of the deviations from 6 equals zero.


148. The mean is a useful way of representing the central
tendency of a distribution because its value depends on
every value in the distribution. The value of the
______ (mean/median/mode) is determined only on the basis of      mode
the most frequently occurring value in the collection of
data. Finally, while you know that there are the same
number of values above the ______ (mean/median/mode) as      median
there are below it, you ______ (do/do not) know how far above      do not
or how far below the median these values are. Thus,
each of the various ways of representing the central
tendency of a distribution has its own peculiar features.
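A short Python sketch can illustrate these peculiar features. Both collections below are hypothetical, and the helper name central_tendency is an assumption introduced only for this illustration. Changing a single extreme value moves the mean but leaves the median and the mode where they were.

```python
import statistics

def central_tendency(data):
    """Return the (mean, median, mode) of a collection of values."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)

# Hypothetical data, not taken from the book.
original = [4, 6, 6, 6, 8]
altered = [4, 6, 6, 6, 38]   # only the largest value has been changed

print(central_tendency(original))
print(central_tendency(altered))
```

The mean responds because it depends on every value; the median and mode do not register how far the changed value lies from the rest.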
Section IV: Variability


We have just seen that the mean, median, and mode are
three ______ that characterize the ______
of a distribution.


Besides their central tendency, there is another
important characteristic of distributions, one that is not
represented by either the mean, the median, or the mode.
Some collections of data are composed of many similar
values, while in other distributions the values might vary
considerably. For example, the values in Data ______ (A/B)
(below) are all very similar to each other, while the
values in Data ______ are much more widely separated.      B


Data A: 20, 21, 20, 19, 20 
Data B: 2, 38, 20, 5, 35 


We have listed the two previous collections of data in 


the following tables: 


Observation    Value    Deviation From 20

     1           20             0
     2           21            +1
     3           20             0
     4           19            -1
     5           20             0

TABLE A


Observation    Value    Deviation From 20

     1            2           -18
     2           38           +18
     3           20             0
     4            5           -15
     5           35           +15

TABLE B


Notice that the tables indicate the deviation of each value
from the reference value ______.

The sum of the deviations in Table A is ______.
Therefore, the mean of Data A is ______.

The sum of the deviations in Table B is ______; therefore,
the mean of Data B and the mean of Data A are
______ (the same/different).

While the means of both distributions are identical,
there is an interesting difference in the two distributions.
If we ignore whether or not a deviation is positive or
negative and only consider its absolute size, it is clear
that the deviations in Table ______ (A/B) tend to be larger
than the deviations in the other table. In other words,
the values in Table ______ (A/B) are more spread out
(dispersed) around the mean value of 20.
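The comparison of Data A and Data B can be sketched in Python, using the values listed for the two collections. The variable names are illustrative only.

```python
data_a = [20, 21, 20, 19, 20]
data_b = [2, 38, 20, 5, 35]
reference = 20

# Signed deviations of each value from the reference value 20.
dev_a = [x - reference for x in data_a]
dev_b = [x - reference for x in data_b]

# Both sets of deviations sum to zero, so 20 is the mean of each collection.
sum_a, sum_b = sum(dev_a), sum(dev_b)

# Ignoring sign, the deviations differ greatly in size between the two collections.
size_a = sum(abs(d) for d in dev_a)
size_b = sum(abs(d) for d in dev_b)

print(dev_a, sum_a, size_a)
print(dev_b, sum_b, size_b)
```

The signed sums cannot distinguish the two collections; only the sizes of the deviations show that one is more spread out.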
The manner in which the deviations from the mean are
different in the two previous collections of data can be
illustrated by the following graphs of raw data.

[GRAPH A and GRAPH B: graphs of raw data, observations 1 to 5 on the horizontal axis, vertical scale 0 to 40, with arrows marking the deviations from the mean]

We have represented deviations from the mean with
arrows, as we have done previously. If we ignore the
direction in which an arrow is pointing and only consider
its length, it is clear that the deviations from the mean
in Data ______ (A/B) are larger than those of the other
collection.


Notice how the heights of the columns in Graph ______ (A/B)
are very similar, whereas the heights of the columns in
the other graph tend to change or vary much more from
one observation to another.

Since the height of a column represents the value of that
observation, we could say that the values on Graph ______ (A/B)
change or vary more than do those on the other graph.
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10. 


11. 


12. 


The more the values in a collection of data vary, the 
more variability that collection of data is said to have. 


Thus, we would say that the variability of Data 
A/B 


was greater than the variability of the other collection 
of data. 


Data composed of many widely different values is said 
to have a great deal of variability. On the other hand, a 
collection of data in which the values are very similar 
or close together could be described as having little 
variability. Therefore, of the following two collections
of data, Data ______ (A/B) would be described as having more
variability than the other collection.


Data A: 10, 9, 10, 10, 11 


Data B: 2, 18, 10, 1, 19 


We could illustrate the difference in variability of the 
two previous distributions with the following graphs: 


[GRAPH A and GRAPH B: graphs of raw data, observations 1 to 5 on the horizontal axis, vertical scale 0 to 20]


Earlier we defined a variable as something that changed
or varied. The more something changes or varies, the
more variable it is said to be. Thus, the observed
values are more variable in Graph ______ (A/B) than in the
other graph.
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13. 


14. 


15. 


16. 


The more variable the values in a collection of data,
the more variability that collection is said to have.
In other words, the more the values in a collection of
data change or vary from observation to observation, the
more ______ that collection is said to have.

If all the values in a collection of data were very similar,
the data would not have much variability. If all the values
are very similar, you would expect the difference between
the largest value and the smallest value to be ______.

In other words, if the difference between the largest and
smallest value in a collection of data were very small,
the data ______ (would/would not) have much variability.


We often use the difference between the largest and
smallest value in a collection of data as a statistic
representing the variability of that data. We call this
statistic the range. In other words, you could represent
the variability of a collection of data by finding the
difference between the largest and the smallest value in
that collection of data. This difference is a statistic
called the ______.

If your data consisted of the values 100, 50, 10, 75, and
80, then the ______ of these values would be 90,
since this value is the difference between the highest
value (100) and the lowest observed value (10).
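As a minimal sketch, the range of the five values in this frame can be computed in Python. The function name value_range is an assumption chosen for illustration.

```python
def value_range(values):
    """Difference between the largest and the smallest observed value."""
    return max(values) - min(values)

data = [100, 50, 10, 75, 80]
print(value_range(data))   # 100 minus 10
```

Only the largest and smallest values enter the calculation; the other three values could change without affecting the result, provided they stay between 10 and 100.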
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17. 


18. 


The relation between the range and what we have referred 
to as the variability of a distribution can be illustrated 
on the following frequency distribution graphs: 


[GRAPH A and GRAPH B: frequency distribution graphs, FREQUENCY (0 to 5) against VALUE (1 to 6)]

The data shown on both of these graphs have the same
mean, but the data on Graph B is more variable than the
data on the other graph.

Notice that the difference between the highest and lowest
observed values on Graph A equals ______ minus ______.

The difference between the highest and lowest value on
Graph B equals ______ minus ______.

Since the range of a collection of data is the difference
between the highest and lowest values, the range of the
data on Graph A is ______ and on Graph B, ______.
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19. 


20. 


21. 


22. 


Notice that the distribution in which the values are spread
farthest from the mean has the largest range. In other
words, the distribution which is most v______
would tend to have the largest range.

Suppose the largest value in a collection of data were 100
and the smallest value were 10. The range of that
collection of data would be ______, since this equals
______ minus 10.

Instead of considering frequency distribution graphs, let's
consider how a graph of raw data indicates the range.
The range of a collection of data is immediately obvious
in a graph of raw data, since the range is the ______
in height of the column representing the smallest
observed value and the height of the column representing
the largest observed value.


The variability of a collection of data does not depend on 
the size of the values in the collection. It only depends 


on the differences in the sizes of the values. For example, 


consider the two graphs of raw data shown below. (Do 
not confuse these graphs with frequency graphs.) 
[GRAPH A and GRAPH B: graphs of raw data plotted by OBSERVATION]

Notice that the values on Graph ______ (A/B) are all larger than
those on the other graph. However, the variability of
the data on Graph ______ (A/B) is greater than the variability
of the other collection of data.


variable 


90, 100 
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23. 


24. 


25. 


The range of the data in Graph ______ (A/B) would be found
by subtracting the value of observation 3 from the value
of observation 4.

The range of a collection of data can also be determined
from a frequency table by finding the difference between
the smallest value having a frequency greater than 0 and
the largest value having a frequency greater than 0. In
the following frequency table, for example, the largest
value having a frequency greater than 0 is ______ and the
smallest value having a frequency greater than 0 is ______.
Therefore, the range of this distribution is ______. The
value 13 was a possible value which ______ (was/was not)
observed. This is why we only consider values with
frequencies greater than zero.


10 


11 
12 
13 


The range of the distribution shown in the following
frequency distribution graph is ______, since the smallest
observed value is ______ and the largest observed value
is ______.

[Frequency distribution graph: FREQUENCY against VALUE]
26. The range can sometimes give a misleading picture
of the variability of a distribution. For example,
consider the following two frequency distribution graphs.

[GRAPH A and GRAPH B: frequency distribution graphs, FREQUENCY against VALUE]

Notice that the range of the distribution in Graph A is
______ and the range of the distribution in Graph B is ______.

Notice that all the values except one are identical in
Graph ______ (A/B), whereas the values in the other graph were
all different. Therefore, even though both distributions
have the same range, the distribution in Graph ______ (A/B)
appears to be more variable than does the other
distribution.

The problem in representing the variability of a
distribution with the range is that it is determined solely
by the largest and smallest observed values.
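The weakness of the range can be sketched in Python. The two collections below are hypothetical stand-ins for the graphs, which are not reproduced here: in the first, every value but one is identical, and in the second, all the values differ.

```python
# Hypothetical collections (assumptions made for illustration).
mostly_identical = [1, 10, 10, 10, 10, 10]
all_different = [1, 3, 5, 6, 8, 10]

def value_range(values):
    """Difference between the largest and the smallest observed value."""
    return max(values) - min(values)

range_first = value_range(mostly_identical)
range_second = value_range(all_different)

# The range is the same even though the collections are spread very differently.
print(range_first, range_second)
print(len(set(mostly_identical)), len(set(all_different)))   # distinct values in each
```

Because the range looks only at the two extreme values, it cannot tell these two collections apart.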
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27. 


28. 


29. 


Earlier, we pointed out how the variability of a 
collection of data is related to the deviations of the 
values from their mean. For example, consider the two 
graphs of raw data shown below: 


[GRAPH A and GRAPH B: graphs of raw data, observations 1 to 6 on the horizontal axis, vertical scale 0 to 20, with the mean of each collection marked]

We have shown deviations from the mean value on both
of these graphs. Thus, the mean of both distributions
is ______ (the same/different), since it equals 10 on Graph A and
10 on Graph B.

While we know that the sum of the deviations from 10
equals ______ in both distributions, there is, nevertheless,
an important difference between the two distributions.
It is apparent that the deviations from 10 tend to be
larger for the data shown on Graph ______ (A/B) than for the
data in the other graph.

You could describe the differences between the two
distributions by saying that the value of each observation
tended to vary or change more on Graph ______ (A/B) than on
the other graph.
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31. 


32. 


Data containing many widely separated values of a
variable is said to have more variability than data
containing very similar values. Thus, of the two
previous distributions, the distribution in Graph ______ (A/B)
could be described as having the most variability.

In other words, we could say that the two distributions
shown in Frame 27 are similar in terms of their
______ (central tendencies/variability) and different in terms of
their ______ (central tendencies/variability).


The relationship between the variability of a collection 
of data and the size of the deviations from the mean is 
illustrated by the following two tables. 


15 
8 
12 
5 


10 
11 
10 


TABLE B 


Notice that the deviation of observation 4 from the mean
in Table A was obtained by subtracting ______ from ______
to yield an answer of ______.
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33. 


34. 


35. 


36. 


37. 


Notice that the sum of the deviations in Table A equals
______ and the sum of the deviations in Table B equals ______.      0
This indicates that the ______ of both      0, mean
distributions is 10.

It is clear that the values tend to be farther away from
the mean in Table ______ (A/B) than they do in the other table.      A

Without considering the deviations from the mean, it is
apparent that the value of the observations changed or
varied more in Table ______ (A/B) than they did in the other      A
table. Therefore, the ______ of the data      variability
in Table A appears to be greater than the variability of
the data in the other table.

We could represent the difference in variability by
calculating the range of each collection of data. The
range of the data in Table A would be ______, since this      10
equals ______ minus ______.      15, 5
Since the range of the data in Table B is ______, the      2
range of the data in Table A is ______ (greater/smaller) than the      greater
range of the data in Table B.


The range is not the only statistic you can use to represent
the variability of a distribution. There is another way
of characterizing the difference in variability of the
two previous collections of data. Notice that if we ignore
whether a deviation is positive or negative, the
deviations in Table ______ (A/B) tend to be larger than those in      A
the other table.
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(Continued) 


You can think of the variability of a collection of data
as the degree to which the values are spread out from the
mean. In other words, if all of the values in a collection
of data are very similar, they will all be clustered very
close to the mean and the data will not have much
variability. If the values in a collection of data are
widely dispersed (spread out), the deviations from the
mean will tend to be very ______ (large/small) and the
distribution could be described as having a great deal
of ______.

We have seen several illustrations in which two collections
of data have the same mean but different variability. It
is also possible for two collections of data to have
different means and the same variability. For example,
consider the two graphs shown below. Notice that the
mean of the data in Graph ______ (A/B) is smaller than the
mean of the other collection of data.

[GRAPH A and GRAPH B: graphs of raw data plotted by OBSERVATION, vertical scale 0 to 40, with the mean of each collection marked]

However, the variability of the two collections of data
(shown above) about their respective means ______ (is/is not)
almost identical. Thus, the two distributions could be
described as having similar ______ (variability/means) but
different ______ (variability/means).
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41. 


42. 


Therefore, simply knowing that two collections of
data are similar in terms of their central tendencies
______ (does/does not) indicate whether they are similar in terms
of their variability.


Another illustration of the lack of any relationship 
between the mean of a distribution and its variability is 
given by the following two tables of raw data: 


TABLE A 


1 10 
2 8 
3 6 


TABLE B 


Notice that the data in each table consists of
observations of a ______ (numerical/non-numerical) variable.

The particular reference value from which the
deviations are calculated in each of the previous tables
is the ______ of that collection of data, since the
sum of the deviations in each table equals ______. Thus,
the mean of the data in Table A is ______, whereas the
mean of the data in Table B is ______.
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46. 


We could represent the variability of each collection
of data by its range. The range of the data in Table A
is ______ and the range of the data in Table B is ______.

Earlier, we pointed out how the variability of a
collection of data could be thought of as the degree to
which the observed values were spread out or dispersed
about the mean of that collection of data. The ______
is a useful measure of variability because it is the
difference between the value having the greatest positive
deviation from the mean and the value having the greatest
negative deviation from the mean.


We also noted earlier that the range is not a completely
satisfactory way of representing variability. Two 
distributions may have identical ranges and yet one 
distribution may appear to be much more variable than 
the other. For example, consider the two collections 


of data shown in the following tables: 


15 
5 
12 
7 
8 


DATA # A DATA # B 
The smallest value in each table is ______ and the
largest value is ______. Notice that the range of both
collections of data equals ______.

While the range is the same in both collections of data,
all the values, except one, are identical in Data ______ (A/B).
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49. 


This illustration points up one of the disadvantages of
the range as a way of representing variability. The
range is determined by only two of the observed values
in the collection of data:

1) the ______ observed value, and
2) the ______ observed value.

Since the largest observed value in each table is ______
and the smallest observed value is ______, the collections
of data shown in the following two tables have
______ (the same/different) ranges.
1 


2 
3 
4 
5 
6 1 


TABLE A 


-4 


1 
1 -4 
1 -4 
8 
9 

10 


TABLE B 


We have listed the deviations of the values in each table
from their common mean of ______. The largest and
smallest values in both collections of data are the same.
However, the other observed values in Table ______ (A/B) are all
very close (or identical) to the mean. All the values in
the other distribution are almost as far away from the
mean as are the two extreme values.


10


largest 


smallest 


10 
1 


the same 
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50. 


51. 


Even though the two extreme deviations in both
collections of data are the same, the typical size of a
deviation in Data ______ (A/B) is greater than in the other
collection of data. This feature can be illustrated by the
following graphs of raw data:

[DATA A and DATA B: graphs of raw data, VALUE against OBSERVATION (1 to 6), with arrows marking the deviations from the mean]
We have indicated the deviations from the mean in both
collections of data with arrows, just as we have done
previously. The lengths of these arrows indicate how
most of the observed values in Data ______ (A/B) are very
close to the mean, whereas the values in the other graph
tend to be farther away from the mean.


We have already seen that the mean of a collection of 
values can be thought of as the typical value. Therefore, 
one way we might represent the typical size of a deviation 
would be to find the mean, or the average of the deviations. 


However, our formula for finding the mean of a collection
of values ($\sum X / N$) says that the first thing to do is to add
together all the values. No matter what the variability of
a distribution, the sum of the deviations from the mean
would always equal ______.
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52. 


53. 


54. 


Statisticians have found it useful to represent the
variability of a distribution by the typical or average
size of the squared deviations from the mean. Whenever
you square a number, your answer will be positive,
whether the number you are squaring is positive or
negative.

For example, $2^2 = 2 \times 2 = 4$.

Similarly, $(-2)^2 = (-2) \times (-2)$ = ______.

Therefore, whether a deviation is positive or negative,
when you square it, your answer will be ______ (positive/negative).

In the table of data shown below, we have listed 3
observed values, their deviations from the mean, and
the square of these deviations. Thus, $X_1$ has a value of
______ and a deviation from the mean of ______. The
square of that deviation is ______ times ______, which
equals ______.

If you add (sum) all 3 observed values, you obtain the
total of the observed values, which is ______. Dividing
this total by ______ indicates that the mean of these three
values is ______.

Adding up the deviations from the mean value of 5 will
naturally yield an answer of ______, because the mean is
defined as that particular reference value from which
the deviations sum to ______.
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Observation Deviation from 5 | Squared Deviation 
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55. 


56. 


57. 


58. 


To find the mean of a group of values, you first find
their total, and then divide this total by the number of
values. If you wanted to find the mean (average)
squared deviation, you could first add all the squared
deviations and then divide by the number of squared
deviations. In our previous example (see the table in
Frame 53), the total of the squared deviations
equals ______ plus ______ plus ______, which equals ______.

Since there were 3 observations in the collection of data,
you should divide the total of the squared deviations by
______ to find the mean of these squared deviations.

Thus, the mean (average) of the squared deviations
equals ______ divided by ______, which yields the
answer of ______.

Statisticians refer to the average of the squared
deviations as the variance. Therefore, the variance of
the previous collection of data is ______.


The variance is a statistic representing the variability of
a collection of data. Values that are widely dispersed
(spread out from the mean) have large deviations from
the mean. Whether these deviations are positive or
negative, they will result in large squared deviations.
Therefore, saying that a collection has a large variance
implies that the values tend to be ______ (spread out from/close to)
the mean.

A collection of data having observations of all the same
value has the ______ (most/least) possible variability.
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9, 9, 0, 18 


18, 3 


spread out from


least 
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59. 


60. 


61. 


62. 


Suppose all the observations in a collection of data
had the value 10. The mean of that collection of data
would equal ______ and each observation would have a
deviation from the mean equal to ______.

If each deviation equaled zero, each deviation squared
would also equal zero. Since the variance of a collection
of data is the typical size of the squared deviations, the
variance of this collection of data would equal ______.

A collection of data having the least possible variability
would have, therefore, a variance equal to ______. It
would also have a range equal to ______.

Consider the collection of data shown below:

The mean of this collection of data is 6. We have left
room in the third column of the table to insert the deviation
of each observation from the mean. The deviation of
observation 1 from the mean equals ______ minus ______,
which equals ______.

Squaring the deviation of observation 1 from the mean
equals ______ times ______, which equals ______.
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63. 


64. 


65. 


66. 


We could find the deviation and the square of that
deviation in a similar manner for each of the other
observed values and summarize our work in the following
table:

Deviation From 6 | Sq. of the Deviation

        2        |          4
       -1        |          1
       -1        |          1
        0        |          0

To find the average of the squared deviations (the
variance), we would ______ all of the squared
deviations and divide this total by ______.

Therefore, the variance of the previous collection of
data equals ______ divided by ______. Thus, 1.5 is the
______ of the previous collection of data.

The variance of the data in the following table equals
______ divided by ______. The variance, therefore,
equals ______.

Observation    Deviation from 10 | Sq. of Deviation

    15
     5
     5
    15
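The two variance calculations in these frames can be sketched in Python. The function name variance is an assumption made for illustration. The first collection is inferred from the listed deviations from 6 (2, -1, -1, 0) rather than read from the original table; the second uses the four values shown in the table above.

```python
def variance(values):
    """Mean of the squared deviations from the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / n

# Inferred from the deviations 2, -1, -1, 0 about a mean of 6 (an assumption).
first = [8, 5, 5, 6]

# The four observations listed in the second table.
second = [15, 5, 5, 15]

print(variance(first))    # total of squared deviations 6, divided by 4
print(variance(second))   # total of squared deviations 100, divided by 4
```

In the second collection every deviation from the mean of 10 is +5 or -5, so every squared deviation is 25.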


Since the variance is the ______ of the squared
deviations from the mean, the formula for the variance
will be similar in some ways to the formula for the mean
we considered earlier.

To find the mean of a group of values, we first sum all
the values and then divide this sum by the ______
of values.
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70. 


(Continued) 


In a similar manner, to find the mean of a group of
squared deviations, we first sum all the squared
deviations and then divide this sum by the number of
squared deviations. In other words, the total of all the
squared deviations divided by the number of deviations
equals the mean of the squared deviations, which
particular mean we call the ______.

Representing the mean of a group of values by $\bar{x}$ and the
value of a particular observation by X, we would
represent the deviation of that observation from the
mean as $(X - \bar{x})$. The expression $(X - \bar{x})^2$ would
represent the square of the previous deviation.

If your collection of data consisted of three observations,
you could represent the ______ of these three
observations as:

$X_1 + X_2 + X_3$

Similarly, you could represent the sum of the ______
of these three observations from their mean as:

$(X_1 - \bar{x}) + (X_2 - \bar{x}) + (X_3 - \bar{x})$

Finally, you could represent the sum of the ______
deviations as:

$(X_1 - \bar{x})^2 + (X_2 - \bar{x})^2 + (X_3 - \bar{x})^2$

Just as we represented the sum of all the raw scores by
$\sum X$, we could represent the ______ of the deviations
from the mean by $\sum (X - \bar{x})$.
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Séi 


sum (total) 


deviations 
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sum 
Regardless of what collection of data we are describing,
however, we know that $\sum (X - \bar{x})$ will always equal
______, according to our definition of the mean.

Using the symbol ______ (the capital Greek letter ______)
in the same way as we did when we wrote $\sum X$ and
$\sum (X - \bar{x})$, we could represent the total of all the squared
deviations by ______ $(X - \bar{x})^2$.

Since the variance is the mean of the squared deviations,
we could represent the ______ of a collection
of data as:

$\frac{\sum (X - \bar{x})^2}{N}$

Statisticians use the symbol $\sigma^2$ to represent the variance.
Using this symbol, a formula for the variance could be
written as follows:

$\sigma^2 = \frac{\sum (X - \bar{x})^2}{N}$

The symbol $\sigma$ is the uncapitalized form of the Greek
letter sigma. The summation symbol $\sum$ was the
capital Greek letter sigma. For this reason, the
variance, represented by $\sigma^2$, is often referred to as
______ squared.
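As a hedged sketch, the formula above can be compared with Python's standard statistics module. The three values are hypothetical. Note that the formula in this program divides by N, which corresponds to statistics.pvariance; statistics.variance divides by N - 1 and so gives a different answer.

```python
import statistics

# Hypothetical data, not taken from the book.
data = [2, 5, 8]

n = len(data)
mean = sum(data) / n

# The formula from the text: sum of squared deviations divided by N.
by_formula = sum((x - mean) ** 2 for x in data) / n

library_population = statistics.pvariance(data)   # also divides by N
library_sample = statistics.variance(data)        # divides by N - 1 instead

print(by_formula, library_population, library_sample)
```

When checking hand calculations from this program against a library, pvariance is the function that matches.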
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AT. 


We now have defined two formulas. The first formula
is:

$\bar{x} = \frac{\sum X}{N}$

where $\bar{x}$ (called the ______) is a statistic
representing the c______ t______
of a distribution.

The second formula is:

$\sigma^2 = \frac{\sum (X - \bar{x})^2}{N}$

where $\sigma^2$ (called the v______) is a statistic
representing the ______ of a distribution.

Let's try using the formula for both the mean and the
variance on the collection of data shown in the following
table.


1 
2 
3 


Notice that we have added two extra columns to a table 
of raw data. In the third column of the table, we could 
list the of each observation 


from the mean. In the fourth column we could list the 
of each deviation from the mean. 


Before you can calculate the deviation of each value from
the mean, you must first calculate the mean itself.
Since the formula for the mean is $\bar{x} = \frac{\sum X}{N}$, you must
first find the ______ of all three values and then
divide this result by ______ (since N = 3).
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variance 


variability 


deviation 


square 
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sum (total) 
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78. The sum of the observed values is . Dividing 
this sum by 3, you find the mean to be 


79. Accordingly, you could replace the headings on columns 
3 and 4 in the previous table as follows: 


1 
2 
3 


Note that here we have replaced $\bar{x}$ found in the
earlier table with its actual value, ______.

Notice that since the mean is 5, the deviation of
observation 1 from the mean equals ______, and the
square of this deviation equals ______.

The deviation of observation 2 equals ______, and the
square of this deviation equals ______.

Finally, the deviation of observation 3 equals ______,
and the square of this deviation equals ______.


80. We have summarized these answers in the following 


table: 


2 


1 zı 1 
2 
3 
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(Continued) 


Remember, the formula for the variance,

$\sigma^2 = \frac{\sum (X - \bar{x})^2}{N}$,

says to sum all of the squared deviations and then to
______ this sum by the number of observations.      divide
The variance of the data in the previous table, therefore,
equals ______ divided by ______. This means $\sigma^2$ = ______      2, 3, 5
for this collection of three observed values.

Suppose the deviation from the mean of every value in a
collection of data equaled 2 or -2. The square of each
of the positive deviations would equal 2 times 2, or 4,
and the square of every negative deviation would equal
______ times ______, which would also equal 4.      -2, -2

Since the square of every deviation from the mean would
be 4, the typical or mean squared deviation would equal 4. We
would, therefore, say that the variance of the distribution
was ______.


Find the error in the following table. 


Observation 


Notice that you would correct this error by changing 
-5, 45 (or simply 5) 


+5 


to , since the deviation of 10 from the 


reference value 5 is 
+5 / -5 
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83. 


84. 


85. 


86. 


87. 


Since the sum of the squared deviations equals ______
and since there are 3 observations, the variance of
the previous data would equal ______ divided by ______.
Therefore, $\sigma^2$ = ______.

If your data consisted of the values 3 and 7, the mean
would be ______. The variance, therefore, would
equal ______ divided by ______. In other words,
$\sigma^2$ = ______.

Since $\bar{x} = \frac{\sum X}{N}$, the mean of 3 and 7 would equal $\frac{3 + 7}{2}$ = ______.

Similarly, if the mean equals 5, $\frac{\sum (X - \bar{x})^2}{N}$ would equal

$\frac{(3 - 5)^2 + (7 - 5)^2}{2}$, which equals ______.

Since the variance can be thought of as the typical size 

of a squared deviation, statisticians have found it useful 
to assign a special name to the square root of the 
variance. They call the square root of the variance the 
standard deviation. Therefore, the standard deviation 

of a distribution is a deviation which, if squared, would
equal the ______ (even if none of the observed
values actually has this particular deviation).

If the variance of your data were 9, an observation which
had a positive deviation from the mean of ______ would
have a squared deviation equal to the variance.
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88. 


89. 


90. 


91. 


In other words, if the variance of your data were 9,
the standard deviation would be ______, since $3^2 = 9$.

If the variance of your data were 25, the standard
deviation would equal ______, since (______)$^2$ = 25.
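The relation between the variance and the standard deviation can be sketched in Python. The function names and the four-value collection are assumptions chosen for illustration; the collection was picked so that every deviation from its mean is +3 or -3.

```python
import math

def variance(values):
    """Mean of the squared deviations from the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical data: the mean is 10 and every deviation is +3 or -3.
data = [7, 13, 7, 13]

print(variance(data), standard_deviation(data))
print(math.sqrt(9), math.sqrt(25))   # standard deviations for variances of 9 and 25
```

Squaring the standard deviation returns the variance, which is why the two statistics share the same symbol.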


Just as a group of values may have a mean that is not
equal to any of the values, it is not necessary for any
particular value in a collection of data to have a
deviation from the mean exactly equal to a standard
deviation. Therefore, whereas the standard deviation
can be thought of as the typical size of a deviation in
some collection of data, it ______ (is/is not) necessary for any
value in the data to actually have this deviation.


To summarize, the variance represented by the symbol
______ is the typical or average squared deviation.
Any deviation whose square equals this average squared
deviation would be called the ______ ______.
Remember, it ______ (is/is not) necessary that any value in
the data actually have this particular deviation.

The variance is equal to the square of the standard
deviation. Therefore, the standard deviation is equal to
the ______ ______ of the variance.

Since the variance is represented by $\sigma^2$, the standard
deviation could be represented by $\sqrt{\sigma^2}$. But $\sqrt{\sigma^2}$ is
simply $\sigma$. Thus, the variance is represented by
______ ($\sigma^2$ or $\sigma$) and the standard deviation by ______ ($\sigma^2$ or $\sigma$).
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is not 


standard deviation 


is not 


square root 


86c 




92. 


93. 


94. 


95. 


96. 


By using σ² to represent the ______ and σ to
represent the ______ ______,
we emphasize the fact that the variance is simply the
______ of the standard deviation. In other
words, we emphasize that the ______ ______
of the variance equals the standard deviation by letting
______ represent the standard deviation and
______ represent the variance.


Suppose all the values in a collection of data deviated
from the mean by either +3 or -3. The variance of that
collection of data would equal ______ and the standard
deviation would equal ______.

In other words, σ² equals ______ and σ equals ______.
                          9/3                 9/3
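
The same relation can be checked numerically. The sketch below is a parallel case, not the text's own example: it assumes a small hypothetical collection of data in which every value deviates from the mean by +4 or -4, and it divides by the number of values, as described above.

```python
import math

# Hypothetical data (not from the text): the mean is 10 and every value
# lies exactly 4 away from it.
values = [6, 14, 6, 14]

mean = sum(values) / len(values)
deviations = [v - mean for v in values]             # each is +4 or -4
squared_deviations = [d ** 2 for d in deviations]   # each is 16

variance = sum(squared_deviations) / len(values)    # average squared deviation
standard_deviation = math.sqrt(variance)            # square root of the variance

print(variance, standard_deviation)  # prints: 16.0 4.0
```

Because every squared deviation is the same, the average squared deviation is that same number, and its square root recovers the size of a single deviation.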


We have considered three statistics used to represent
the central tendency of a collection of data. These three
statistics are the ______, the ______,
and the ______.

The ______ is the most frequently occurring value
in a collection of data. The ______ is a value
which would divide a list of the ranked data into two
equal parts. The ______ is that particular
reference value from which the sum of the deviations
equals 0.
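
As a rough illustration of these three statistics, the sketch below uses Python's standard `statistics` module on a small hypothetical collection of data (the numbers are assumptions chosen for illustration, not data from the text). It also checks that the deviations taken from the last of the three statistics sum to 0.

```python
from statistics import mean, median, mode

# Hypothetical collection of data, chosen only for illustration.
data = [3, 5, 5, 6, 11]

most_frequent = mode(data)    # the most frequently occurring value
middle_value = median(data)   # divides the ranked list into two equal parts
typical_value = mean(data)    # reference value whose deviations sum to 0

deviations = [x - typical_value for x in data]
print(most_frequent, middle_value, typical_value, sum(deviations))
```

Taking the deviations from any other reference value (for example 5) would leave a sum different from 0.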


We have also considered three statistics used to represent
the variability or dispersion of a collection of data. They
are the ______, the ______, and its
square root, called the ______ ______.

range, variance

97.


98. 


99. 


100.


The ______ is the difference between the
largest and smallest observed value. The ______
is the typical or average of the squared deviations from
the mean of that collection of data. The standard
deviation is a deviation which when squared will equal
the ______.


We have also found a way to write rules for calculating
the mean and the variance. To find the typical value or
mean, we ______ all the values and ______
by the number of values. The formula for the mean is
written, therefore, as:

    X̄ = ΣX / N

To find the typical squared deviation (the variance), we
sum all of the ______ ______ and
divide by the number of values. The formula for the
variance, therefore, is written as:

    σ² = ______ / N

The standard deviation is simply the ______ ______
of this variance.
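
The two rules can be written out as a short sketch. The function names and the sample scores below are assumptions made for illustration; the only thing taken from the text is the procedure itself (sum, then divide by the number of values, with the standard deviation as the square root of the variance).

```python
import math

def mean(values):
    """Add all the values and divide by the number of values."""
    return sum(values) / len(values)

def variance(values):
    """Sum the squared deviations from the mean and divide by the number of values."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)

def standard_deviation(values):
    """The square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical scores, chosen so the arithmetic comes out evenly.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(scores), variance(scores), standard_deviation(scores))  # prints: 5.0 4.0 2.0
```

Note that this sketch divides by the number of values, following the rule stated here; many software packages also offer a variant that divides by one less than the number of values, which is a different statistic.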


The statistics which describe the ______ ______
of the distribution represent in one
way or another the typical value to be found in that
distribution. A statistic representing the ______
of the distribution describes the degree to which the
observed values are spread out or dispersed.
range
variance


variance


add (sum), divide


squared deviations


Σ(X - X̄)²


square root
central


101.


Statistics describing central tendency do not tell you 
anything about the variability of a distribution. 
Statistics describing variability do not tell you anything 
about central tendency. For example, consider the four


distributions shown in the following graphs. 


[Figure: frequency graphs of DISTRIBUTION A, DISTRIBUTION B, DISTRIBUTION C, and DISTRIBUTION D, each plotting FREQUENCY (0 to 100) against VALUE (up to 15).]


Notice that distributions A and B are similar in terms of
their ______ yet quite different in
      central tendency/ variability
terms of their ______.
               central tendency/ variability

central tendency
102. On the other hand, Distribution A is similar to
Distribution C in terms of ______,                          variability
                           central tendency/ variability
whereas the mean of Distribution A appears to be
______ the mean of Distribution C.                          larger than
larger than/ equal to


103. Distributions B and D are similar in terms of
______.                                                     variability
central tendency/ variability


104. The choice of one statistic over another to represent
some feature of a distribution depends upon the particular
feature you wish to represent. For example, consider
the distribution shown below. Except for a few extreme
(unusually large) values, most of the values are grouped
around the value ______.                                    2
                 2/5

[Figure: frequency graph plotting FREQUENCY (0 to 40) against VALUE.]

If you represented the central tendency of the preceding
data by the mean, the extreme values would tend to pull
the mean away from the value 2. On the other hand, if
you used the mode to represent the central tendency of
the distribution, its value would be ______. In other       2
words, the mode might be a better way of representing
the central tendency in terms of this unusual distribution
since the mode ______ influenced by the few                 is not
               is/ is not
extreme values found in the distribution.
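
A small numerical sketch makes the same point. The data below are hypothetical (they are not read off the graph): most values sit at or near 2, and three unusually large values are added at the upper end.

```python
from statistics import mean, mode

# Hypothetical data: grouped around 2, plus a few extreme values.
grouped = [1, 2, 2, 2, 2, 2, 3, 3]
extremes = [10, 11, 12]
data = grouped + extremes

# The extreme values pull the mean upward but leave the mode where it was.
print(mean(grouped), mode(grouped))
print(mean(data), mode(data))
```

In this sketch the mode stays at 2 with or without the extreme values, while the mean moves from a little above 2 to well above 4.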
REVIEW II
FILL IN THE BLANKS:


1. The most frequently occurring value in the distribution
is called the ______.                                       mode


2. NUMBER OF ERRORS 


3 
6 
2 
3 


In the table above, pick out the modal number of            3
errors.
3. A value in a collection of data which is smaller than half
of the other observed values and larger than the
remaining values is called the ______.                      median
4. Find the median in the following collection of data:
14, 11, 8, 4, 2                                             8
5. Find the median in the following set of numbers:
9, 7, 8, 6, 4                                               7
MULTIPLE CHOICE:


6. It is possible for two collections of data to have
different means and:
a. radial variability.
b. diametric variability.
c. the same variability.
d. none of the above


7. The degree to which the observed values are spread out
or are dispersed from the mean of a collection of
data can be referred to as the:
a. mean.
b. variability of that collection of data.
c. median.
d. none of the above


8. The average of the squared deviations from the mean
is called the:
a. standard deviation.
b. variance.
c. mean.
d. none of the above


9. The square root of the variance is called the:
a. mean.
b. median.
c. standard deviation.
d. none of the above
10. If the variance of your data were 36, the standard
deviation would be:


a. [illegible]
b. [illegible]
c. [illegible]


d. none of the above 


11. The symbol representing the standard deviation is:


d. none of the above 


TRUE OR FALSE: 


12. 

a negative deviation from the reference value. 

13. Let us assume that the reference value is 8 and the 
observation is 6. In this case, the deviation is 2. 

14. If we add all the deviations of a group of observed values 
from a particular reference value, and the answer is 
zero, that reference value is the mean of those values. 

15. The difference between the largest and smallest value 


in a collection of data is called the median. 
16. If your data consisted of the values 73, 22, 14, 91,
and 11, the range would be 47.

17. It is possible for two collections of data to have the
same mean, but different variability.
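
Statements about the range, and about collections of data that share a mean, can be checked directly with a few lines of code. The sketch below uses two hypothetical collections of data (assumed for illustration, not taken from the review items) and the definition of the range given earlier: the largest observed value minus the smallest.

```python
def value_range(values):
    """Difference between the largest and smallest observed value."""
    return max(values) - min(values)

def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)

# Two hypothetical collections of data.
tight = [9, 10, 11]
spread = [1, 10, 19]

print(mean(tight), value_range(tight))
print(mean(spread), value_range(spread))
```

Both collections have a mean of 10, yet their ranges are 2 and 18, so the mean alone says nothing about how far the values are spread out.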
Section V: Types of Distributions


In addition to describing the central tendency and
variability of a distribution, it is often useful to say
something else about the general shape of the distribution.
For example, consider first the distribution shown in
Figure A below. Now look at Figure A′. In Figure A′,
we have shown how half of Figure A would look reflected
in a mirror.


[Figure: FIGURE A and FIGURE A′, each plotting FREQUENCY (0 to 20) against VALUE; a MIRROR is placed across the distribution in FIGURE A′.]


Notice how the reflection of the left half of the
distribution has exactly the same shape as the right
half of the distribution. (The right half is behind the
mirror in Figure A′.) In other words, if we cut the
distribution shown in Figure A in half along the line
where we placed the mirror in Figure A′, we would
have divided the distribution into two parts with the
same shape but facing in opposite directions.

(Continued)


In comparison, consider the distribution shown in
Figure B below. In Figure B′ the reflection in the
mirror ______ have the same shape as that
       does/ does not
part of the distribution behind the mirror.


[Figure: FIGURE B and FIGURE B′, each plotting FREQUENCY (0 to 20) against VALUE; a MIRROR is placed across the distribution in FIGURE B′.]


Therefore, of the previous distributions A and B,
Distribution ______ can be divided in half so that the
             A/B
right half looks like a "mirror image" of the left half,
whereas it would not be possible to divide the other
distribution in such a manner.
A distribution which can be divided so that its left
half appears to be a mirror image of its right half is
called a symmetrical distribution. If a distribution
cannot be so divided, it is called an asymmetrical
distribution. Thus, of the two distributions shown
below, Distribution ______ is symmetrical, whereas the
                    A/B
other distribution is ______.


[Figure: DISTRIBUTION A and DISTRIBUTION B, each plotting FREQUENCY (0 to 20) against VALUE.]


Distribution A would be called symmetrical since you 
could place a mirror so that the reflection in the mirror 
had exactly the same shape as that part of the distribution 
hidden behind the mirror. (See the illustration below.) 
In other words, the right half of the distribution is a 
"mirror image" of the left half of the distribution. 


[Figure: DISTRIBUTION A′.]

On the other hand, the reflection of the left half of
Distribution B (Distribution B′) ______ have the
                                 does/ does not
same shape as that part of the distribution hidden
behind the mirror. We would say, therefore, that
Distribution B was ______.
                   symmetrical/ asymmetrical

[Figure: DISTRIBUTION B and DISTRIBUTION B′.]


Of the four distributions shown below, Distributions
______ and ______ would be described as symmetrical,
while the other two distributions would be described as
asymmetrical.

In some distributions, the scores seem to be piled
up towards one end of the distribution. For example,
in Distribution ______ (shown below) most of the observed   A
                A/B
values occurred near the low-valued end of the
distribution, with only a few values occurring near the
upper end of the distribution.
The scores in Distribution B tend to be piled up near
the ______ values, with only a few observed values          high
    high/ low
down near the low end of the distribution.

Both of the distributions shown above would be referred
to as ______ distributions.                                 asymmetrical
      asymmetrical/ symmetrical

Asymmetrical distributions in which most of the scores
are piled up near one end could be regarded as
"lopsided."

(Continued)


Distributions of this sort are said to be skewed. If a
skewed distribution has most of the observations piled up
near its low values, the distribution is said to be
positively skewed. If the distribution has
most of the observations piled up near the high values,
the distribution is said to be negatively skewed. Thus,
of the three distributions shown below, Distribution ______
is positively skewed, whereas Distribution ______ is
negatively skewed. Distribution ______, however, is
neither positively nor negatively skewed, since this
distribution is symmetrical.
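
One common way to check the direction of skew numerically is to look at the sign of the average cubed deviation from the mean. That measure is an assumption brought in here for illustration; the text itself defines skew only by where the observations pile up. The function name and the three sample collections are likewise hypothetical.

```python
def skew_direction(values):
    """Classify skew by the sign of the average cubed deviation from the mean."""
    n = len(values)
    m = sum(values) / n
    third_moment = sum((x - m) ** 3 for x in values) / n
    if third_moment > 0:
        return "positively skewed"   # piled up at low values, long upper tail
    if third_moment < 0:
        return "negatively skewed"   # piled up at high values, long lower tail
    return "no skew detected"        # what a symmetrical distribution gives

piled_low = [1, 1, 1, 2, 2, 3, 9]
piled_high = [9, 9, 9, 8, 8, 7, 1]
balanced = [1, 2, 2, 3, 3, 3, 4, 4, 5]

for sample in (piled_low, piled_high, balanced):
    print(skew_direction(sample))
```

A symmetrical distribution always gives a value of zero on this measure, but a value of zero does not by itself prove symmetry, which is why the last label is worded cautiously.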
10.


Remember that a ______ distribution is
                skewed/ symmetrical
regarded as "lopsided." A ______
                          skewed/ symmetrical
distribution, however, could be regarded as perfectly
balanced at its center, so balanced that its left half is a
"mirror image" of its right half.


We can think of any collection of data as a record of the
observed ______ of a variable.


Any collection of data can be described in terms of the
various frequencies with which the different values occur
in the data. This group of frequencies is referred to as
the ______ of the data.


Often the difference between two distributions can be 
made apparent by drawing a picture of the distribution in 
the form of a frequency graph. Graphs of this sort make 
the "shape" of the distribution apparent. 


For example, we could describe Distribution ______ (shown
                                            A/B
below) as a symmetrical distribution and Distribution
______ as a ______ skewed distribution.
A/B         positively/negatively

11.


12. 


13. 


You would describe Distribution A (above) as symmetrical
since its left half is simply the reverse of its right
half. In other words, the left half of Distribution A
mirrors its right half. Distribution B, however, is not
symmetrical. Distribution B is "lopsided," with most
of the observed values piled up near one end. Lopsided
or skewed distributions are described as
positively skewed when the piling up occurs near the
______-valued end of the distribution, and negatively
high/low
skewed when the piling up occurs near the ______-
                                          high/ low
valued end.


[Figure: frequency graph plotting FREQUENCY against VALUE.]

Most of the observed values in the above graph are
"piled up" near the ______ of the distribution, with
                    ends/ center
very few observed values near the ______ of the
                                  ends/ center
distribution.

The previous distribution, however, ______
                                    would/ would not
be described as symmetrical since its left side
______ mirror its right side.
does/ does not

14.


15. 


16. 


As a psychologist, you will encounter distributions with 
many different shapes. You will find, however, that
certain types of distributions are encountered more 
often than others. For example, a very common type 
of distribution is one in which most of the values are 
piled up near the mean, with fewer and fewer values 


occurring farther from the mean. 


In other words, values similar to the mean would have
the ______ frequencies, whereas values                      larger
    larger/smaller
farther away from the mean would have ______                smaller
                                      larger/ smaller
frequencies.

Distribution ______ (below) would be an example of the      A
             A/B
common type of distribution we just described.

[Figure: DISTRIBUTION A and DISTRIBUTION B, each plotted against VALUE.]


This commonly encountered type of distribution is often 
described as "bell-shaped" because its shape is similar 


to a bell's. 
Although the shape of the previous distribution is not
identical to the shape of a bell, it is convenient to
describe this kind of distribution as approximately "bell-
shaped." You could describe the difference between the
two distributions below by saying that Distribution ______  A
                                                    A/B
is approximately "bell-shaped," whereas Distribution
______ is not.                                              B
A/B

[Figure: DISTRIBUTION A and DISTRIBUTION B, each plotting FREQUENCY against VALUE.]


Let's consider an example of a "bell-shaped" distribution. 
Suppose you filled a glass jar with 20 marbles and then 
asked a large number of students to estimate how many 
marbles there were in the jar. Some of the estimates 
would be too high and others would be too low. You would 
expect, however, that most of the estimates would be 
fairly close to the actual number of marbles in the jar. 
Occasionally, you would obtain some poor estimates, such 


as 15 or 25. On the other hand, you would expect
estimates near ______ to be more frequent                   twenty
               twenty/ twenty-five
than estimates near ______.                                 twenty-five
                    twenty/ twenty-five

19. You would probably find the distribution of these
estimates similar to which of the distributions shown 


below? 
20. Both of the above graphs would be called ______         relative
                                             absolute/ relative
frequency distributions, since they show the ______         proportion
of subjects who estimated each of the possible values.


21. If Distribution A (above) had been the distribution of your
data, the subjects would have been acting very strangely.
According to Distribution A, many of the subjects over-
estimated and many of the subjects under-estimated but
very ______ of the subjects made estimates close            few
to the true number of marbles in the jar. (Remember,
there were 20 marbles in the jar.)


22. According to Distribution B (above), none of the
subjects' estimates was greater than ______                 23
or less than ______.                                        17
23.


24. 


25. 


26. 


You could describe the variability of the estimates in B
by saying that "the range of the estimates was ______."

In other words, the range equals ______ minus ______.


The proportion of students who estimated that there were
more than 21 marbles in the jar is simply the proportion
who estimated that there were 22 or more. If [fraction] of the
students estimated 22 marbles, [fraction] of the students
estimated 23 marbles, and none of the students estimated
more than 23 marbles, you could say ______ of the
students estimated there were more than 21 marbles in
the jar.


The proportion of students who estimated more than 20
marbles is simply the proportion of students who
estimated there were 21 marbles, plus the proportion
who estimated there were ______ marbles, plus the
proportion who estimated there were ______ marbles.
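
Adding proportions in this way is easy to sketch in code. The relative frequency distribution below is hypothetical (it is not the distribution pictured in the text), and `proportion_above` is a helper name invented for this illustration. Exact fractions are used so the sums come out without rounding.

```python
from fractions import Fraction

# Hypothetical relative frequency distribution: estimate -> proportion of students.
proportions = {
    18: Fraction(1, 10),
    19: Fraction(2, 10),
    20: Fraction(4, 10),
    21: Fraction(2, 10),
    22: Fraction(1, 10),
}

def proportion_above(distribution, cutoff):
    """Add the proportions of every value larger than the cutoff."""
    return sum((p for value, p in distribution.items() if value > cutoff), Fraction(0))

total = sum(proportions.values(), Fraction(0))
print(total)                              # all the proportions together
print(proportion_above(proportions, 20))  # proportion estimating more than 20
```

In this sketch the proportions for 21 and 22 add to 3/10, and all five proportions together add to 1.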


Imagine that the distribution of estimates was as follows:
(Notice that we have shaded the columns representing
subjects who estimated more than ______ marbles.)

29.


30. 


We pointed out earlier that the sum of all the proportions
in a proportional (relative) frequency distribution has to
equal ______. Since the height of each column represents
the proportion of students who estimated a particular
value, the total height of all the columns added together
must equal ______.


According to the preceding graph, [fraction] of the students
estimated 18, [fraction] estimated 19, [fraction] estimated 20,
[fraction] estimated 21, whereas the remaining [fraction] of the students
estimated ______.

Thus, the total of the proportions represented by all the
columns is ______/16, which equals ______.


If we added together the heights of the two shaded
columns in the previous distribution, they would form a
column whose height was equal to the proportion of
students who estimated more than ______ marbles.
Notice that the proportion of students estimating more
than 20 equals ______ plus ______, or ______.


In the following distribution we have shaded the columns
representing students who estimated ______ than
                                    fewer/ more
20 marbles.

31.


32. 


33. 


(Continued) 


A column as high as the two shaded columns combined
would represent the proportion of students who estimated
fewer than ______ marbles. This proportion would be ______.

In the following distribution we have shaded columns
corresponding to people who estimated fewer than ______
or more than ______ marbles.


[Figure: relative frequency graph plotting PROPORTION against ESTIMATE.]


A subject who estimated 22 marbles would have made an
error of 2 since there are actually 20 marbles in the jar.
A subject who estimated 18 marbles would also have
been in error by 2. The estimates indicated by the
shaded columns in the previous distribution represent
subjects who made an error of more than ______.

If the mean of the distribution of estimates was 20, an
estimate of 22 would correspond to a positive deviation
from the mean of ______.

Similarly an estimate of ______ would correspond to a
negative deviation of -2.

36.


The shaded columns in the previous distribution
indicate the proportion of subjects whose estimates had
either a positive deviation from the mean of ______
or more, or a negative deviation of ______ or more.
(We use the words "or more" to indicate an estimate
even farther away from the mean in either a positive
or negative direction.)

The shaded column in the following distribution
represents the proportion of estimates that had a
______ deviation from the mean estimate of ______.
positive/ negative

[Figure: relative frequency graph plotting PROPORTION against ESTIMATE, with one column shaded.]


We will use the phrase absolute deviation or absolute
error when we are interested only in the distance
between a value and the mean and not in whether the
deviation is positive or negative.

In other words, when we are interested only in the
difference between a value and the mean and not in
whether this difference is positive or negative, we will
use the phrase ______ ______ or ______ ______.

37.


38. 


39. 


40. 


A value that deviates from the mean by +2 is the same
distance away from the mean as a value having a deviation
of -2. The value 10 deviates from the value of 8 by
+2 (or simply 2). The value 6 deviates from 8 by -2.
Both 6 and 10, however, are the same distance away
from the value ______. This is what is meant when we
say that the value 10 and the value 6 have the same
absolute deviation from 8.
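
The distinction between a signed deviation and an absolute deviation can be sketched in a few lines. The function names are assumptions made for illustration; the numbers are the ones used in the passage above.

```python
def deviation(value, reference):
    """Signed deviation: positive above the reference value, negative below it."""
    return value - reference

def absolute_deviation(value, reference):
    """Distance from the reference value, ignoring direction."""
    return abs(value - reference)

for value in (6, 10):
    print(value, deviation(value, 8), absolute_deviation(value, 8))
```

The two signed deviations differ (-2 and +2), while the two absolute deviations are the same.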


We just used an illustration in which 100 students
attempted to estimate the number of marbles in a jar
containing 20 marbles. An estimate of 22 would have
been just as accurate as an estimate of 18, since both
estimates would have been in error by ______.

If the mean of the estimates was 20, an estimate of 22
would represent a positive deviation of 2 and an estimate
of 18 would represent a negative deviation of -2.
Therefore, both an estimate of 22 and an estimate of 18
would represent the same ______ ______
from the mean.


Suppose the distribution of student estimates was as
follows:

[Figure: relative frequency graph plotting PROPORTION against ESTIMATE (16 to 25).]


absolute deviation
(error)


42.
44.


45. 


(Continued) 


The mean of this distribution is 20. Therefore, an
estimate of 24 would represent a positive deviation of
______, and an estimate of 16 would represent a
negative deviation of ______.

On the previous distribution graph, the shaded columns
represent estimates having an absolute deviation of less
than ______.

The unshaded columns in the previous graph indicate the
proportion of estimates having an absolute deviation
from the mean of more than ______.

In a bell-shaped distribution, large absolute deviations
are ______ frequent than small absolute deviations.
    more/ less

The small absolute deviations are more frequent because
most of the observed values are clustered around the
mean in a bell-shaped distribution. Estimates much
larger or much smaller than the mean have a
______ absolute deviation and are less frequent
large/ small
than those values with ______ absolute deviations.
                       large/ small

Students who estimated 25 marbles are unusual in the
sense that few students made estimates that large.
Similarly, students estimating only 15 marbles are also
unusual, since there were very few estimates that small.
In other words, the ______ the absolute deviation
                    larger/ smaller
of a student's estimate from the mean, the more unusual
was his estimate.
47.


48.


In the distribution shown below, we have shaded the
columns representing the proportions of students who
made either unusually ______ or unusually ______
estimates.

[Figure: relative frequency graph plotting PROPORTION against ESTIMATE (16 to 25), with some columns shaded.]


An estimate representing a large absolute deviation from
the true number of marbles in the jar would be considered
a ______ estimate.
  good/ poor


Suppose you conducted the marble estimation experiment 
with two groups of students, one group of 8-year-old 
students and another group of 18-year-old students. 
While you would expect the 18-year-old students to make 
some errors in their estimates, you would expect them 


to be more accurate than you would the younger students. 


In other words, although the older students would make
some mistakes, you would expect the absolute deviations
of their estimates from the true number of marbles to
be generally ______ than those of the younger
             larger/ smaller
students.

48.


49. 


(Continued) 


Accordingly, if the following distributions represented
the distributions of estimates from the two groups,
Distribution ______ is probably the distribution of
estimates for the younger group.

[Figure: DISTRIBUTION A and DISTRIBUTION B, each plotting PROPORTION OF STUDENTS against ESTIMATE (17 to 23).]

Distribution B represents the older students' estimates.
It is clear that the older students tended to make more
accurate estimates than did the younger students. The
older students' estimates (Distribution B) tended to be
closer to the true number of marbles than were the
estimates of the younger students. You could say that
the variability of the ______ students' estimates
                       younger/ older
was greater than that of the other group.
51.


52.


We could represent the difference in ______
between the two distributions by the range. The range
of Distribution A is ______ and the range of
Distribution B is ______.

Estimates of either ______ or ______ would have an
absolute deviation of 2 from the true number of marbles
(20), since the first estimate would be 2 fewer than the
actual value and the second estimate would be 2 greater
than the actual value.

The previous distributions are reproduced below. This
time we have shaded the columns representing estimates
whose absolute deviations from the true value of 20 were
______ or more.

[Figure: GRAPH A (YOUNGER STUDENTS' ESTIMATES) and GRAPH B (OLDER STUDENTS' ESTIMATES), each plotting PROPORTION OF STUDENTS against ESTIMATE (17 to 23).]
53.


54.


55.


56.


57.


One way of comparing the accuracy of the younger and
older students' estimates is to compare the proportion
of students in each group whose estimates differed from
the true value by 2 or more. The proportion of students
who made errors greater than one is indicated by the
______ columns in the previous graphs.                      shaded
shaded/ unshaded

Estimates whose absolute deviation from the true value
of 20 were greater than one occurred more often in the
______ group of students than they did in the other         younger
younger/ older
group.

Among the younger students, an estimate of 22 was
______ unusual than an estimate of 21, since the            more
more/ less
proportion of younger students who estimated 22 was
______ than the proportion of younger students              smaller
greater/ smaller
who estimated 21.

An estimate of 22, however, was ______ unusual for          less
                                more/ less
younger students than it was for older students, since
the proportion of younger students who made estimates
of 22 is ______ than the proportion of older                larger
         larger/ smaller
students who made estimates of 22.

Compared only with the rest of the people in his own age
group, a young student who estimated 22 would not have
performed quite so poorly as an older student who had
estimated 22, since an estimate of 22 was more common
(occurred more frequently) in the ______ group              younger
                                  younger/ older
than it was in the other group.
58.


59.


Suppose you decided to give a prize to all the students
whose estimates were within one marble of the true value.
Or, to put it differently, suppose you decided to give a
prize to all the students whose estimates had an absolute
deviation from the true value of ______ or less.            1

The ______ columns in the following graphs                  unshaded
    shaded/ unshaded
indicate the proportions of students in each age group who
would receive a prize.
60.


61.


If you only gave a prize to those younger students
whose estimates were within 1 of the actual value, the
unshaded columns in Graph ______ (below) would indicate
B
the proportion of younger students who received a prize.

[Figure: GRAPH A (YOUNGER STUDENTS' ESTIMATES) and GRAPH B, each plotting PROPORTION OF STUDENTS against ESTIMATE (17 to 23).]


Since the older students were more accurate in
estimating the number of marbles, ______ of their
                                  fewer/ more
estimates would meet the requirements for a prize than
would those of the younger students because ______
                                            more/ less
of the older students had estimates close enough to the
true value.

62.


The unshaded columns in the following graphs indicate
the proportion of older students who would receive a
prize if we only gave a prize to students who made errors
of ______ or less.

The other graph indicates the proportion of younger
students who would receive a prize if we gave a prize to all
younger students who made an error of ______ or less
in their estimates.

62.


63.


64.


65.


(Continued)


The proportion of younger students obtaining prizes and
the proportion of older students obtaining prizes
______ be approximately the same if we used
would/ would not
these two rules for awarding prizes.


We could conclude from these considerations that a(n)
______ student whose estimate was within 1
younger/ older
of the actual value was doing just about as well in
relation to the rest of the people in his age group as was
a(n) ______ student whose estimate was within 2
     younger/ older
of the actual value.

By considering the difference in variability between the
two groups we established rules for giving prizes whereby
approximately the same proportion of students received
prizes in each group. Since large errors (absolute
deviations) were more frequent in the younger group,
a ______ student would not be required to be
  younger/ older
quite as accurate in order to win a prize as would a
student in the other age group.


The preceding example illustrates why it is often useful
to consider the variability of a distribution when you are
evaluating a particular observed value. For
example, imagine you were teaching a course in
psychology. You gave your students two examinations
during the semester. Suppose there were 10 questions
on each test and the student received either a score of 1
or a score of 0 on each question. The possible total
score on each test would be somewhere between ________ and
________.
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65. 


66. 


67. 


68. 


(Continued) 


Suppose the results of these tests were those shown in 
the following frequency tables. 


[Frequency tables for Examination A and Examination B,
each listing Score and Frequency.]


According to these data, there were ________ students in
the class.

There must have been 20 students in the class because
the sum of the ________ in each frequency
table is 20.


The variability of the scores on Test ________ (A/B) appears to        B
be slightly larger than the variability on the other test.
The range is one statistic which would represent this
difference in variability, since the range was ________ on
Test A and ________ on Test B.

It also appears that σ² was larger on Test ________ (A/B) than on
the other test.
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69. 


70. 


71. 


72. 


The preceding absolute frequency distributions could be
converted to relative frequency distributions by
dividing each frequency by ________ in order to convert        20
it to a ________.                                              proportion


The two proportional distributions which you would obtain 
in this manner are shown in the following tables. 


[Relative frequency tables for Examination A and Examination B,
each listing the proportion of students at every score from 0 to 10.]


Notice that ________ of the students received scores of
3 or lower on Examination A, whereas 3/20 of the
students received scores of 3 or lower on Examination B.


The mean of both previous distributions (Examination A
and B) is 5½. A score of 5 would have a negative
deviation of -½ from the mean and a score of 6 would
have a positive deviation of ________ from the mean.

The only two scores which have an absolute deviation of
½ or less (from the mean of 5½) are ________ and ________.        5, 6


One-half of all 20 students received scores whose
absolute deviation from the mean was ½ or less on
Examination ________.
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73. 


74. 


75. 


76. 


The scores 5 and 6 are the only scores deviating from
the mean by ½ or less on both examinations. The
proportion of students who received scores of 5 or 6
is equal to

eee en 

2058 20 02:20) 
on Examination . 
B 
The proportion of students who received either a score 
of 5 or 6 on Examination B is equal to 4 ; 


which equals 


It was slightly more unusual for a subject's score to
have an absolute deviation from the mean of more than
one-half on Examination ________ than on the other             B
examination, since a higher proportion of students
received grades of 5 or 6 on Examination ________.             B


The absolute deviation of an observed value from the
mean does not necessarily indicate how unusual that
value was in relation to the rest of the distribution. If
most of the values in the distribution were grouped very
closely around the mean, the proportion of values whose
absolute deviation from the mean was greater than 1
might be very small. On the other hand, if the
distribution were quite variable, a much higher
proportion of the values might have absolute deviations
from the mean greater than 1.
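
The point can be checked with a short calculation. The sketch below is only an illustration: the two lists of values are hypothetical (they are not the examination data), and the function name proportion_beyond is an assumption introduced here. Both lists have a mean of 5, yet the proportion of values deviating from the mean by more than 1 is very different.

```python
from fractions import Fraction

def proportion_beyond(values, limit):
    """Proportion of values whose absolute deviation from the mean exceeds limit."""
    mean = Fraction(sum(values), len(values))
    far = [v for v in values if abs(v - mean) > limit]
    return Fraction(len(far), len(values))

# Hypothetical data: both lists have a mean of 5.
tight = [4, 5, 5, 5, 5, 5, 5, 5, 5, 6]    # grouped closely around the mean
spread = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9]   # quite variable

p_tight = proportion_beyond(tight, 1)
p_spread = proportion_beyond(spread, 1)
print("tight:", p_tight, "spread:", p_spread)
```

A deviation larger than 1 never occurs in the first list, whereas it occurs in more than half of the second.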
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76. (Continued) 


For example, consider the two distributions shown below. 
[Figure: DISTRIBUTION A and DISTRIBUTION B, each plotted over the values 0 to 10.]


The mean of both distributions is 5. However, the
variability of Distribution ________ (A/B) appears to be greater
than the variability of the other distribution.

77. While the mean of both distributions is 5, absolute
deviations greater than 1 occur much more frequently
in Distribution ________ (A/B) than they do in the other
distribution.
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While the value 7 represents a deviation from the mean 
of ________ in either distribution, values with deviations
that large or larger were more unusual in Distribution
________ (A/B) than they were in the other distribution.


Statisticians describe an observed value in a way that
takes into account the variability of the distribution. You
saw earlier how an observed value could be represented
by its deviation from the mean rather than by its actual
value. In addition, it is sometimes useful to indicate
the relationship between that deviation and the standard
deviation of the distribution. For example, if the mean
of a particular distribution were 10, the value 15 would
have a deviation from the mean of ________. If the standard
deviation of the distribution were 5, you could say the
value 15 deviates from the mean by exactly one standard
deviation. Similarly, since the value 20 has a deviation
from the mean of 10, 20 would be exactly ________ (two/three)
standard deviations away from the mean if σ = 5.
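
The arithmetic in this frame can be written out as a small sketch. The function name deviation_in_sd_units is an assumption made for illustration; the numbers (a mean of 10, a standard deviation of 5, and the values 15 and 20) come from the frame itself.

```python
def deviation_in_sd_units(value, mean, sd):
    """Express the deviation of a value from the mean in standard deviations."""
    return (value - mean) / sd

mean, sd = 10, 5
units_for_15 = deviation_in_sd_units(15, mean, sd)
units_for_20 = deviation_in_sd_units(20, mean, sd)
print(units_for_15, units_for_20)
```

Dividing by the standard deviation simply changes the unit of measurement: the deviation is counted in standard deviations instead of in raw score points.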


The variance of a distribution is simply the typical
(mean) squared deviation from the mean of that distribution.
The standard deviation squared would equal the ________.
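
As a rough sketch of this definition, the variance of a relative frequency distribution can be computed by weighting each squared deviation by its proportion. The distribution used below is hypothetical and the function name mean_and_variance is an assumption introduced for this example.

```python
import math

def mean_and_variance(distribution):
    """distribution maps each value to its proportion (proportions sum to 1)."""
    mean = sum(value * p for value, p in distribution.items())
    # The variance is the mean squared deviation from the mean.
    variance = sum(p * (value - mean) ** 2 for value, p in distribution.items())
    return mean, variance

# Hypothetical relative frequency distribution.
dist = {2: 0.25, 4: 0.5, 6: 0.25}
mean, variance = mean_and_variance(dist)
sd = math.sqrt(variance)
print(mean, variance, sd)
```

Squaring the standard deviation obtained in the last step returns the variance, which is the relationship the frame asks you to recall.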
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Consider the following relative frequency distribution. 


VALUE PROPORTION 


It can be shown that the variance of this distribution is
16. Therefore, the ________ ________ is equal
to √16 or 4.


The mean is also 4. Thus, the values 3 and 5
________ (are/are not) more than one standard deviation away
from the mean because the absolute deviation from the
mean of 4 for both the value 3 and the value 5 is ________.


You could say the value 5 is ¼ of a standard deviation
away from the mean, since the value 5 represents a
deviation from the mean of 1 and 1 is ¼ the size of the
standard deviation 4. Similarly, the value 6 would be
________ of a standard deviation from the mean, since
the value 6 deviates from the mean by 2 and 2 is ________
the size of the standard deviation 4.


We have described the distance or difference between a
particular value and its mean as the ________
of that value from that mean.
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(Continued) 


If a value is smaller than the mean, you say that it has a
________ (positive/negative) deviation. If a value is larger
than the mean, that value has a ________ (positive/negative)
deviation.

If the distribution had a mean of 6, the value 7 would
represent a deviation of ________ and the value 5 would
represent a deviation of ________.


Suppose the distribution of your data were approximately
"bell shaped." You would know that most of the observed
values were clustered around the mean with progressively
fewer and fewer observed values farther away from the
mean. In other words, the frequency of values having a
deviation of 2 is probably greater than the frequency of
values with a deviation of ________.


The values farthest away from the mean in a "bell-shaped"
distribution are often referred to as the tails
of the distribution (since the distribution appears to
taper into a tail at the extreme values). For example,
the ________ (shaded/unshaded) areas in the following
distribution would be referred to as the tails of the
distribution.

[Figure: bell-shaped distribution, FREQUENCY (vertical axis)
against VALUES (horizontal axis).]
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92. 


The values with the ________ (largest/smallest) absolute
deviations from the mean appear in the tails of a
bell-shaped distribution.


In a so-called "bell-shaped" distribution, the farther a
value is from the mean (the more it deviates from the
mean), the ________ (larger/smaller) will be its frequency.


You should recall that the standard deviation (σ) is a
deviation which, squared, would equal the ________.
If the variance of a distribution were 9, a deviation of 4
would be ________ (greater/less) than one standard deviation.


If the mean of your distribution were 10 and the standard
deviation of the distribution were 4, a score of 14 would
have a deviation from the mean equal to ________ (1/2) standard
deviation(s).


Another way of indicating that a particular value deviates
from the mean by one standard deviation is to say that
that value equals a standard score of one. Thus, if the
value 14 represents a standard score of one in a
distribution with a mean of 10, the standard deviation of
the distribution must be ________.

If the distribution had a standard deviation of 2 and a
mean of 10, then the value ________ would deviate from
the mean by one standard deviation. Therefore, 12
would represent a s________ s________ of one.
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smaller 


variance 


greater 


12 


standard score 
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If a particular value were said to equal a standard score 
of 2, that value would have a deviation from the mean 
equal to twice the standard deviation of the distribution. 
If a particular value equaled a standard score of -2, that
value would represent a negative deviation from the mean
equal in size to twice the ________ ________.


Below is a list of values forming a distribution whose
mean is 10 and whose variance is 25. Since the variance
is 25, the size of a standard deviation is ________.

                                        STANDARD
VALUE       DEVIATION FROM 10           SCORE
 15                                        1
  5                                       -1
 10                                        0


Notice that the first value, 15, deviates from the mean
of 10 by ________. Since the standard deviation of the
distribution is 5, a deviation of 5 would be the same size
as one standard deviation. This is all we mean when we
indicate (as we did in the previous table) that the value
15 is equal to a standard score of ________.

The value 5 in the previous distribution represents a
negative deviation from the mean of -5 and, therefore,
is equivalent to a standard score of ________.
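
The conversion used in the table can be sketched in a few lines. The function name standard_score is an assumption introduced here; the mean of 10, the variance of 25, and the values 15, 5, and 10 are taken from the frames above.

```python
import math

def standard_score(value, mean, variance):
    """Deviation from the mean divided by the standard deviation."""
    return (value - mean) / math.sqrt(variance)

# Each row holds: value, deviation from 10, standard score.
rows = [(v, v - 10, standard_score(v, 10, 25)) for v in (15, 5, 10)]
for value, deviation, score in rows:
    print(value, deviation, score)
```

A negative standard score indicates a value below the mean, and a standard score of zero indicates a value equal to the mean.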


"ues 
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Most of the values in a normal distribution are clustered 
around the mean, with fewer and fewer values farther 
away from the mean (i.e., the distribution is "bell-shaped").
Thus, values representing a Z-score larger
than 2 would be ________ (more/less) frequent than values
representing Z-scores less than 2.


You have already seen that .95 of all the values in a
normal distribution are within two standard deviations
of the mean. Therefore, .95 of all the values in a
normal distribution would represent Z-scores between
-2 and ________.


Thus, a Z-score is a useful way of representing values
in a normal distribution since it indicates how unusual
such values are, regardless of the mean or variance of
the distribution. No matter what the mean or variance
of the normal distribution, you would know that a
Z-score as large as 1 was ________ (more/less) frequent than a
Z-score as large as 3.


Suppose you were told that your score on the last
Psychology examination was approximately equal to a
Z-score of two. This would imply that the distribution
of test scores was approximately normal and that your
score was about ________ standard deviations above
the mean of the distribution.


Furthermore, since only .05 of the values in a normal
distribution are farther than two standard deviations
from the mean, and since these extreme values are
divided equally between negative and positive deviations,
you would know that only about ________ (.05/.025) of the
students had made a ________ (better/poorer) grade on the test.
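
These proportions can be checked against the normal curve itself. The sketch below uses NormalDist from Python's standard statistics module (available from Python 3.8); the variable names are assumptions for illustration. Note that the figures .95 and .025 used in the text are convenient round numbers, and the computed proportions differ from them slightly.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, standard deviation 1

within_two = z.cdf(2) - z.cdf(-2)   # proportion of Z-scores between -2 and 2
above_two = 1 - z.cdf(2)            # proportion of Z-scores larger than 2
below_minus_two = z.cdf(-2)         # proportion of Z-scores smaller than -2

print(round(within_two, 4), round(above_two, 4), round(below_minus_two, 4))
```

Because the normal distribution is symmetrical, the two tails hold equal proportions, which is why the .05 outside two standard deviations is split evenly between the two ends.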


less 


more 


.025 


better 
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152. 


To summarize then, standard scores (or Z-scores if the
distribution is normal) are a convenient way of representing
a value, since they indicate how many standard
________ that value is away from the mean.                 deviations

This is particularly useful to you in the case of a normal
distribution, since you know exactly what proportion
of the values in the distribution are within any
particular number of standard deviations from the mean.
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Section VI: Samples and Populations 


There are some things you can say about a collection of
data even before you actually collect it. Suppose you
were interested, for example, in the heights of students
at a particular high school. You could determine the
distribution of these heights by measuring and recording
the height of each student in the high school. If there
were 1,500 students, your collection of data would
consist of ________ observations of a variable called        1,500
"height."


Suppose you only knew the heights of ten of the 1,500
students. Although these ten observations form a collection
of data, they could also be considered as part of the
larger, complete collection of data. Statisticians use
the name sample to describe a collection of data which is
viewed as part of a larger, complete collection of data.
In other words, the collection of ________ (10/1,500) heights        10
would be considered a sample in this illustration.


Statisticians refer to the complete collection of data
(of which the sample is a part) as a population. Thus,
the heights of the ten students would be considered a
sample, whereas the heights of the 1,500 students would
be considered a ________.                                  population
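
The distinction can be pictured with a short sketch. The heights below are made-up numbers generated only for illustration (the text gives no actual measurements), and the variable names are assumptions.

```python
import random

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical population: one height (in centimeters) for each of 1,500 students.
population = [rng.randint(150, 190) for _ in range(1500)]

# A sample: the heights of ten of those students.
sample = rng.sample(population, 10)

print(len(population), len(sample))
```

Every height in the sample also appears in the population, which is what it means for the sample to be part of the larger, complete collection.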
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Let's look at another example of the difference between 

a sample and a population. Suppose you wanted to know
which of two candidates for some public office was most
preferred by each of the 10,000 people in your city. If
you asked the first 100 people you met on the street to
state their preference, you would have a sample consisting
of ________ observations from the complete collection of        100
data (population) consisting of ________ observations.         10,000


Suppose you were interested in the yearly income of the
200 school teachers in your city. A collection of data
consisting of yearly incomes of only 5 of these teachers
would be a ________ (sample/population) if you viewed it as part        sample
of the ________ (sample/population) consisting of the yearly        population
incomes of all 200 teachers.


It is important to realize that a particular collection of
data could be treated as either a sample or population
depending upon how you view it. In the previous
illustration, for example, the complete collection of data
consisting of the yearly income of each of the 200
teachers in your city was viewed as a population. Suppose,
however, you were interested in the yearly incomes
of all the teachers in your state. A complete collection
of data would now consist of a list of yearly incomes of
all the teachers in the state.

This list would now be the population, whereas the list
of yearly incomes for the 200 teachers in your city now
could be viewed as a ________ (sample/population) from this        sample
population.
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10. 


Notice also that the yearly income for any 5 particular
teachers in your city ________ (could/could not) be viewed as
a sample from either a population consisting of the
incomes of all the teachers in your city, or the larger
population consisting of the yearly incomes of all the
teachers in your state.


A particular collection of data may be viewed as a
________ (sample/population) if you are comparing that collection
to a larger collection of data of which it is a part, or as
a ________ (sample/population) if you are comparing it to a
smaller collection of data which would be included in it.


To define a population, you could describe what would be
included in the complete collection of data. For example,
you could describe a collection of data consisting of the
ages of all the Democratic presidents. You could
describe another collection of data consisting of the ages
of all the Republican presidents.

Either one of these collections of data could be
considered a population, but they ________ (would/would not)
be the same population(s).


A particular collection of data may be a sample from
more than one population. For example, a list of the
ages of all the truck drivers in Detroit could be a
s________ from the population consisting of the ages
of all the truck drivers in the United States. The same
collection of data could be considered a ________
from a population consisting of all the male workers in
Detroit.
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11. 


12. 


13. 


The ages of all the female school teachers in Detroit
________ (would/would not) be a sample from a population        would not
consisting of the ages of all the male drivers in the
United States. However, the ages of all the female
school teachers in Detroit ________ (would/would not) be a sample        would
from a population consisting of the ages of all the people
employed in Detroit.


Up to this point, we have only considered populations as
collections of data that could actually be completely
collected. For example, we discussed a population
consisting of the heights of all the students in a
particular high school. In principle, at least, it
________ (would/would not) be possible to actually collect a        would
complete list of the heights of all the students in the high
school.


It is sometimes useful to think of a particular collection
of data as if it were a sample from a larger collection
of data consisting of an unlimited number of observations.
An example of a population consisting of an unlimited
number of observations would be a list of the outcomes
of an unlimited number of throws of a die (one member
of a pair of dice). You could roll a die and record the
number of dots showing on the face of the die as the
"outcome" of this toss. You could then roll it again,
and again, and again, each time recording the number
of dots showing on the face of the die. In this way, you
could go on producing a list of observed values, without
ever specifying where you should stop. A list of outcomes
for any specific group of tosses could be viewed
as a ________ from a longer, unlimited list of                sample
outcomes, since this smaller collection of data would
be part of the larger, unlimited collection.
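
One way to picture an unlimited list of outcomes is a procedure that can always produce one more toss. The sketch below is an illustration only; the generator name die_rolls and the choice of twelve tosses are assumptions, and the tosses are simulated by the computer.

```python
import itertools
import random

def die_rolls(rng):
    """Yield the outcome of one toss after another, with no specified end."""
    while True:
        yield rng.randint(1, 6)

rng = random.Random(1)  # fixed seed so the illustration is repeatable

# Any specific group of tosses is a sample from the unlimited list.
sample = list(itertools.islice(die_rolls(rng), 12))
print(sample)
```

The procedure never says where to stop, so the twelve recorded outcomes are only part of a list that could be extended indefinitely.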
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14. 


15. 


16. 


If a population consists of a specific number of
observations, it is called a finite population. The hair
color of every person in the United States is a finite
population since, in principle at least, you could actually
list the hair color of every person in the United States.

Similarly, since there are a limited number of people in
the world at any one time, a list of the heights of everyone
in the world would consist of a ________ (limited/unlimited)        limited
number of observations and would therefore be a finite
population.


A population viewed as consisting of an unlimited number
of observations is referred to as an infinite population.
A population consisting of the yearly income of each
person in a particular city ________ (would/would not) be an        would not
infinite population, since the number of people in the city is
limited (in principle, you could actually count them all).


Suppose you shuffled a deck of playing cards, dealt out
five cards, and counted the number of red cards among
the five. You could consider this number to be a single
observation of a variable whose ________ possible values        6
are: 0, 1, 2, 3, 4, and 5. You could shuffle the cards
again, deal out five more cards, and count the number of
red cards in this new group of five. You could continue
this process indefinitely. (The procedure for generating
this list of observations does not specify any end to the
list.) In other words, the possible number of
observations would be ________ (limited/unlimited). Any specific        unlimited
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16. 


17. 


18. 


19. 


(Continued) 


group of observations, therefore, could be considered to
be a sample from a(n) ________ (finite/infinite) population
consisting of the unlimited number of observations.
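
The card procedure can be imitated with a short sketch. It assumes an ordinary 52-card deck with 26 red and 26 black cards; the function name red_cards_in_hand and the choice of eight repetitions are assumptions introduced for illustration.

```python
import random

def red_cards_in_hand(rng, hand_size=5):
    """Shuffle a 52-card deck, deal hand_size cards, and count the red ones."""
    deck = ["red"] * 26 + ["black"] * 26
    rng.shuffle(deck)
    return deck[:hand_size].count("red")

rng = random.Random(2)  # fixed seed so the illustration is repeatable

# Eight observations: a sample from the unlimited list the procedure could generate.
observations = [red_cards_in_hand(rng) for _ in range(8)]
print(observations)
```

Each observation is one of the six possible values 0 through 5, and nothing in the procedure limits how many observations could be produced.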


We have seen that every collection of data describes a
distribution. The distribution can either be described in
absolute terms, by listing the frequency with which each
value occurred in the data, or it can be described in
relative terms, by listing the p________ of times
each value occurred in the data.


A distribution can be described by statistics such as the
mean, the median, and the mode, all of which
characterize its c________ t________,
or a distribution can be described by statistics such as
the range and the variance, which describe its
v________.


While you may be interested in the distribution of a
particular collection of data, for one reason or another
you will often have only part of that complete collection.
In other words, although you are really interested in a
________ (population/sample), you may have only a
________ (population/sample).
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infinite 


proportion 


central tendency 


variability 


population 


sample 
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Suppose, for example, you were interested in the 

number of people in the United States with a college
education. It would probably be too difficult, too
time-consuming, and too expensive to actually list whether or
not each person in the United States held a college
degree. Suppose, however, you stopped twenty people
on a street corner and asked them if they held a college
degree. The data you collected would be only part of the
data in which you were interested. Therefore, the
twenty observations would be a ________ from the              sample
________ consisting of the education of all                   population
the people in the United States.

21.

22.


Since there is a specific number of people in the United
States, your list of twenty observations would be a
sample from a ________ (finite/infinite) population.

Suppose you found that four of the twenty people you had
stopped on the street did hold college degrees. You
could describe the distribution of your sample by the
absolute frequency distribution shown in Graph ________ (A/B)
or by the relative frequency distribution shown in
Graph ________ (A/B).
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24. 


26. 


27. 


If you felt that the sample of 20 people was typical or
representative of the whole population, you might guess
that the proportion of people in the United States with
college degrees was about ________.

In other words, while you don't actually have a complete
collection of data, you might feel that the sample is
similar to the ________.

A guess at some statistic describing the population on
the basis of a sample from that population is called a
statistical inference. You made a s________
i________ about the proportion of college
graduates in the United States on the basis of a sample
of 20 people.
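
The guess described in these frames amounts to a single division. In the sketch below, the sample of twenty answers (four of them "yes") comes from the text; the variable names are assumptions introduced for illustration.

```python
from fractions import Fraction

# The twenty answers: True means the person held a college degree.
answers = [True] * 4 + [False] * 16

# The proportion of degree holders in the sample.
sample_proportion = Fraction(sum(answers), len(answers))

# If the sample is taken to be representative, the same figure serves as
# the guess about the proportion in the whole population.
inferred_population_proportion = sample_proportion
print(sample_proportion, float(sample_proportion))
```

Nothing in the calculation guarantees that the guess is correct; it only carries the sample figure over to the population.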


A statistic describing a population is often called a
population parameter. For example, if you had a
complete collection of data listing whether or not each
person in the United States had a college degree, you
could calculate the true proportion of people with
college degrees in this population. Since the proportion
would be a statistic describing a population, it would be
called a population p________.

Any statistic describing a sample is called a sample
statistic. Thus, the proportion of the people with
degrees in your sample would be a ________
(sample statistic/population parameter).
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28. 


29. 


30. 


Suppose you were a shoe manufacturer who had to decide
what proportion of his total shoe production should be
devoted to each possible shoe size. You would want the
distribution of shoe sizes you produced to be as close as
possible to the actual distribution of shoe sizes required
by the people who could purchase your shoes, since it
would not be efficient for you to produce more shoes
than you could sell in one size and fewer shoes than
could be sold in another size. In principle at least, you
could measure the shoe size of every potential customer
and thereby determine the actual distribution of shoe
sizes in the population. It probably would be out of the
question, however, to actually determine the shoe sizes
of all these people. Suppose you were able to obtain the
shoe sizes of 100 potential customers. You might
guess something about the distribution of the
________ (sample/population) based on the distribution of the
________ (sample/population) you had obtained.

In other words, on the basis of a sample of 100 shoe
sizes from the population, you would be making a
statistical in________ concerning the
distribution of shoe sizes in the population of potential
customers.

The difference between the largest and the smallest shoe
size in the sample of 100 shoe sizes would be the
________ of the sample.
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31. 


32. 


33. 


34. 


35. 


The difference between the largest and the smallest
shoe size in the population of all potential customers
is a statistic that describes the ________ (population/sample)
and would, therefore, be called a p________ statistic.

The variance of your sample would be a ________
(sample statistic/population parameter).

If you had a complete collection of data concerning the
shoe sizes of all potential customers, you could actually
calculate the true variance of the population. This
variance would be a population statistic or p________.

Suppose you guessed that the population (parametric)
mean was the same as the mean of your sample. You
would be using a ________ (population/sample) statistic as an
estimate of a ________ (population/sample) statistic.

You would be making an inference about the ________
on the basis of a ________. The difference between
your estimate of the population mean and the true
population mean would be your error of estimate.

In other words, if the true mean were 10 and your estimate
were 9, the error of estimate would be 10 - 9, or ________.
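
The error of estimate is a plain subtraction, as the following sketch shows. The function name error_of_estimate is an assumption introduced here, and the order of subtraction (true value minus estimate) follows the arithmetic in the frame above.

```python
def error_of_estimate(true_value, estimate):
    """Difference between the true population value and the estimate of it."""
    return true_value - estimate

# The case described in the frame: a true mean of 10 and an estimate of 9.
error = error_of_estimate(10, 9)
print(error)
```

An estimate that happened to equal the true value would give an error of estimate of zero.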
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population 


population 


sample statistic 


parameter 


sample 


population 


population 


sample 
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37. 


38. 


39. 


Consider the two collections of data shown below: 


[TABLE A and TABLE B, each listing Observation and Value.]


Suppose the data in Table A represented a population and
the data in Table B represented a sample from that
population. The mode of the population is ________,
whereas the mode of the sample is ________.

If you used the sample mode as an estimate of the
population mode, your error of estimate would equal
20 - ________, or ________.

If the sample were typical, or representative, of the
population, the difference between a sample statistic
and the population statistic would be small. In other
words, your e________ of e________ would
be small if you used this sample statistic as your
estimate of the population statistic.

However, the sample statistic might be quite different
from the population statistic and the error of estimate
would then be ________ (large/small).
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40. Some samples you obtained could be quite representative 
of the population because the sample statistic was very 
similar to the population statistic. However, you could 
also obtain samples which gave you a poor or distorted 
picture of the population. It would be useful to know how 
often different values of the sample statistic would occur 
if you collected one sample after another. Suppose there 
were 1000 students at the high school and you knew the 


height of five particular students. The collection of
1000 heights would be your ________ (sample/population),        population
whereas the collection of five heights would be a
________ (sample/population).                                   sample


41. Suppose you had twenty such samples from this
population, where each sample was a collection of the
heights of five students. If you calculated the mean of
each of these samples, the list of these twenty means
would be a collection of twenty ________ (sample/population)        sample
statistics.


42. The frequencies with which each possible value of the 
sample mean occurred in this collection of twenty 
sample statistics define a distribution of sample means. 
The distribution of these twenty sample means is called 
a sampling distribution, since it is a distribution of 


sample statistics. 


Instead of calculating the mean of each of the twenty
samples, you could calculate the variance of each
sample. The list of twenty sample variances would
also be a collection of twenty ________ statistics.          sample

Therefore, the distribution of these twenty sample
variances would be a ________ distribution.                  sampling
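
The procedure in these frames can be imitated with a short simulation. Everything numerical below is hypothetical: the 1000 heights are generated by the computer, and rounding each sample mean to the nearest inch is a simplification made here so that repeated values can be tallied.

```python
import random
from collections import Counter
from statistics import mean

rng = random.Random(3)  # fixed seed so the illustration is repeatable

# Hypothetical population: the heights (in inches) of 1000 students.
population = [rng.randint(60, 75) for _ in range(1000)]

# Twenty samples of five heights each, with the mean of each sample.
sample_means = [round(mean(rng.sample(population, 5))) for _ in range(20)]

# The frequency of each value of the sample mean: a sampling distribution.
sampling_distribution = Counter(sample_means)
for value in sorted(sampling_distribution):
    print(value, sampling_distribution[value])
```

Replacing mean with another statistic (for instance, variance from the same module) would give a sampling distribution of that statistic instead.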
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43. 


44. 


45. 


46. 


Similarly, you could calculate the median, the mode, the
range, or the standard error (or any other sample
statistic) of these twenty samples. In each case, the
distribution of the resulting twenty sample statistics
would be called a s________ distribution.

If you collected several samples from some population
and calculated the mean of each sample, the following
graph might be the frequency distribution of these
sample means and would therefore be a ________
distribution of sample means.
According to this graph, there were ________ (10/20) samples,
since you calculated a mean for each sample.

The largest (not the most frequent) sample mean was
________ and the smallest sample mean was ________.

The most frequently occurring sample mean was 8
since ________ of the twenty samples had that mean.
Therefore, the modal value of the sample mean was ________.




sampling 


sampling 


20 


102.

103.

104.

105.

106.

Converting values to standard scores is useful to indicate
how many standard ________ each value is from the mean.        deviations

In distributions that are bell shaped, standard scores
of 2 or -2 are ________ (more/less) frequent than are standard        less
scores of 1 or -1.

Deviations of two standard deviations (standard scores
of 2 or -2) are less frequent than deviations of one
standard deviation, since most of the values are clustered
around the mean in the bell-shaped distribution.

Suppose you collected the data shown in the following
table:

[Table: OBSERVATION (1 to 10) and VALUES (X).]

Since the total of all observed values is 100, you
could divide this by 10 and find the mean: X̄ is equal to 10.

The deviation of observation 1 from the mean is
simply X₁ - X̄, in this case 15 - 10, which equals 5.


47.


48.


Suppose you were interested in a population consisting
of the yearly incomes of all of the people in a particular
city. If you determined the yearly income of ten people
you stopped on the street in that city, the resulting list
of ten incomes would be considered a ________ from           sample
that population.


Imagine you collected twenty samples, where each
sample consisted of a list of the yearly incomes (to the
nearest thousand dollars) of ten people. Furthermore,
imagine you collected these twenty samples by stopping
people in the lobby of the most expensive hotel in town.
Suppose you collected another 20 samples on a street
corner in the poorest section of town. The graphs
shown below could indicate the distribution of sample
means in each group of twenty samples.
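
A rough simulation shows why the place where you collect your samples matters. All of the incomes below are invented for illustration (the text gives no income figures), and the names hotel_incomes, street_incomes, and sample_means are assumptions introduced here.

```python
import random
from statistics import mean

rng = random.Random(4)  # fixed seed so the illustration is repeatable

# Hypothetical yearly incomes, to the nearest thousand dollars.
hotel_incomes = [rng.randint(20, 60) for _ in range(500)]   # hotel lobby
street_incomes = [rng.randint(2, 8) for _ in range(500)]    # poorest section

def sample_means(source, n_samples=20, size=10):
    """Means of n_samples samples, each holding size incomes from source."""
    return [mean(rng.sample(source, size)) for _ in range(n_samples)]

hotel_means = sample_means(hotel_incomes)
street_means = sample_means(street_incomes)
print(min(hotel_means), max(hotel_means))
print(min(street_means), max(street_means))
```

With these invented figures the two sets of twenty sample means do not overlap at all, even though both groups of samples come from the same city.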
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RESPONSE MASK 


Tear this mask off and use 

it to cover the responses on the 
right-hand side of each page as 
you work through this programmed 


text. 


Study the first frame (statement). 
Then write your response in the 
space provided or on a separate 
sheet of paper. NOW move the 
mask down to reveal the printed 
(correct) response and check your 
answer against it. Remember to 


keep the other responses covered. 


Repeat this procedure for each 
frame. When you have completed 
a page, slip this mask over the 
responses on the next page and 


continue your studies. 
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